search
HomeJavajavaTutorialHow to implement JDBC batch insert in Java
How to implement JDBC batch insert in JavaMay 18, 2023 am 10:02 AM
javajdbc

    1. Description

    In JDBC, the executeBatch method can execute multiple dml statements in batches, which is much more efficient than executing executeUpdate individually. This is What is the principle? How to implement batch execution in mysql and oracle? This article will introduce to you the principle behind this.

    2. Experiment introduction

    This experiment will be carried out through the following three steps

    a. Record the time consuming of jdbc batch execution and single execution in mysql

    b. Record the time consuming of batch execution and single execution of jdbc in Oracle

    c. Record the time consuming of batch execution and single execution of Oracle plsql

    The relevant java and database versions are as follows: Java17, Mysql8, Oracle11G

    3. Formal experiment

    Create a table in mysql and oracle respectively

    create table t (  -- mysql中创建表的语句
        id    int,
        name1 varchar(100),
        name2 varchar(100),
        name3 varchar(100),
        name4 varchar(100)
    );
    rrree

    You need to turn on the audit of the database before the experiment

    Mysql turns on auditing:

    create table t (  -- oracle中创建表的语句
        id    number,
        name1 varchar2(100),
        name2 varchar2(100),
        name3 varchar2(100),
        name4 varchar2(100)
    );

    oracle turns on auditing:

    set global general_log = 1;

    java code is as follows:

    alter system set audit_trail=db, extended;  
    audit insert table by scott;  -- 实验采用scott用户批量执行insert的方式

    Several points to note in the code,

    • The url of mysql needs to add useServerPrepStmts=true&rewriteBatchedStatements=true parameters.

    • batchCnt represents the number of SQL statements executed in each batch, and 0 represents single execution.

    First test mysql

    import java.sql.*;
    
    public class JdbcBatchTest {
    
        /**
         * @param dbType 数据库类型,oracle或mysql
         * @param totalCnt 插入的总行数
         * @param batchCnt 每批次插入的行数,0表示单条插入
         */
        public static void exec(String dbType, int totalCnt, int batchCnt) throws SQLException, ClassNotFoundException {
            String user = "scott";
            String password = "xxxx";
            String driver;
            String url;
            if (dbType.equals("mysql")) {
                driver = "com.mysql.cj.jdbc.Driver";
                url = "jdbc:mysql://ip/hello?useServerPrepStmts=true&rewriteBatchedStatements=true";
            } else {
                driver = "oracle.jdbc.OracleDriver";
                url = "jdbc:oracle:thin:@ip:orcl";
            }
    
            long l1 = System.currentTimeMillis();
            Class.forName(driver);
            Connection connection = DriverManager.getConnection(url, user, password);
            connection.setAutoCommit(false);
            String sql = "insert into t values (?, ?, ?, ?, ?)";
            PreparedStatement preparedStatement = connection.prepareStatement(sql);
            for (int i = 1; i <= totalCnt; i++) {
                preparedStatement.setInt(1, i);
                preparedStatement.setString(2, "red" + i);
                preparedStatement.setString(3, "yel" + i);
                preparedStatement.setString(4, "bal" + i);
                preparedStatement.setString(5, "pin" + i);
    
                if (batchCnt > 0) {
                    // 批量执行
                    preparedStatement.addBatch();
                    if (i % batchCnt == 0) {
                        preparedStatement.executeBatch();
                    } else if (i == totalCnt) {
                        preparedStatement.executeBatch();
                    }
                } else {
                    // 单条执行
                    preparedStatement.executeUpdate();
                }
            }
            connection.commit();
            connection.close();
            long l2 = System.currentTimeMillis();
            System.out.println("总条数:" + totalCnt + (batchCnt>0? (",每批插入:"+batchCnt) : ",单条插入") + ",一共耗时:"+ (l2-l1) + " 毫秒");
        }
    
        public static void main(String[] args) throws SQLException, ClassNotFoundException {
            exec("mysql", 10000, 50);
        }
    }

    Put in different batchCnt values ​​to see the execution time

    batchCnt=50 Total number: 10000, each batch Insert: 50, total time spent: 4369 milliseconds
    batchCnt=100 Total number of items: 10000, insert per batch: 100, total time spent: 2598 milliseconds
    batchCnt=200 Total number of items: 10000, insert per batch: 200, total time spent: 2211 milliseconds
    batchCnt=1000 Total number of entries: 10000, each batch of insertions: 1000, total time spent: 2099 milliseconds
    batchCnt=10000 Total number of entries: 10000, each batch of insertions: 10000, Total time spent: 2418 milliseconds
    batchCnt=0 Total number of entries: 10000, single insertion, total time spent: 59620 milliseconds

    View general log

    batchCnt=5

    batchCnt=0

    Several conclusions can be drawn:

    • The efficiency of batch execution is greatly improved compared to single execution.

    • The batch execution of mysql actually rewrites sql and merges multiple inserts into insert xx values(),()... for execution.

    • When changing batchCnt from 50 to 100, the time is basically shortened by half. However, when this value is expanded, the time reduction is not obvious, and the execution time will even increase. high.

    Analysis reason:

    After the client sends the SQL statement to be executed to the database server, the database executes the SQL statement and returns the result to the client. Total time taken = database execution time network transmission time. Reducing the number of round trips through batch execution reduces network transfer time and therefore overall time. However, when batchCnt becomes larger, even if the network transmission time is no longer the main bottleneck, the reduction in the total time will not be so obvious. Especially when batchCnt=10000, that is, all 10,000 statements are executed at one time, the time becomes longer. This may be because the program and database need to apply for larger memory when preparing these input parameters, so it takes more time. (My guess).

    One more thing, can the value of batchCnt be infinite? Suppose I need to insert 100 million items, can I insert 100 million items in batches at one time? Of course not, we don’t consider the space problem of undo. First of all, your computer does not have such a large memory to save all the 100 million SQL input parameters at once. Secondly, mysql also has a parameter max_allowed_packet to limit the length of a single statement. The maximum is 1Gbyte. When the statement is too long, "Packet for query is too large (1,773,901 > 1,599,488). You can change this value on the server by setting the 'max_allowed_packet' variable" will be reported.

    Next test oracle

    exec("mysql", 10000, batchCnt);

    Substitute different batchCnt values ​​​​to see the execution time

    batchCnt=50 Total number: 10000, each batch insertion: 50, total Time consumption: 2055 milliseconds
    batchCnt=100 Total number of items: 10000, each batch insertion: 100, total time consumption: 1324 milliseconds
    batchCnt=200 Total number of items: 10000, each batch insertion: 200, total time consumption : 856 milliseconds
    batchCnt=1000 Total number of entries: 10000, each batch of insertions: 1000, total time spent: 785 milliseconds
    batchCnt=10000 Total number of entries: 10000, each batch of insertions: 10000, total time spent: 804 Milliseconds
    batchCnt=0 Total number of entries: 10000, single insertion, total time consumption: 60830 milliseconds

    The effect of execution in Oracle is basically the same as that in MySQL, and the efficiency of batch processing operations is obviously high. Executed in a single line. The problem is that there is no such insert xx values(),()... syntax in Oracle, so how does it achieve batch execution?

    View the audit view dba_audit_trail

    when executing batchCnt=50

    从审计的结果中可以看到,batchCnt=50的时候,审计记录只有200条(扣除登入和登出),也就是sql只执行了200次。sql_text没有发生改写,仍然是"insert into t values (:1 , :2 , :3 , :4 , :5 )",而且sql_bind只记录了批量执行的最后一个参数,即50的倍数。根据awr报告可以看出,实际只执行了200次(由于篇幅限制,省略了awr截图)。那么oracle是怎么做到只执行200次但插入1万条记录的呢?我们来看看oracle中使用存储过程的批量插入。

    四、存储过程

    准备数据:

    首先将t表清空 truncate table t;

    用java往t表灌10万数据 exec("oracle", 100000, 1000);

    创建t1表 create table t1 as select * from t where 1 = 0;

    以下两个过程的意图一致,均为将t表中的数据导入t1表。nobatch是单次执行,usebatch是批量执行。

    create or replace procedure nobatch is
    begin
      for x in (select * from t)
      loop
        insert into t1 (id, name1, name2, name3, name4)
        values (x.id, x.name1, x.name2, x.name3, x.name4);
      end loop;
      commit;
    end nobatch;
    /
    create or replace procedure usebatch (p_array_size in pls_integer)
    is
      type array is table of t%rowtype;
      l_data array;
      cursor c is select * from t;
    begin
      open c;
      loop
        fetch c bulk collect into l_data limit p_array_size;
        forall i in 1..l_data.count insert into t1 values l_data(i);
        exit when c%notfound;
      end loop;
      commit;
      close c;
    end usebatch;
    /

    执行上述存储过程

    SQL> exec nobatch;  
    Elapsed: 00:00:32.92

    SQL> exec usebatch(50);
    Elapsed: 00:00:00.77

    SQL> exec usebatch(100);
    Elapsed: 00:00:00.47

    SQL> exec usebatch(1000);
    Elapsed: 00:00:00.19

    SQL> exec usebatch(100000);
    Elapsed: 00:00:00.26

    存储过程批量执行效率也远远高于单条执行。查看usebatch(50)执行时的审计日志,sql_bind也只记录了批量执行的最后一个参数,即50的倍数。与使用executeBatch方法在记录内容方面相同。因此可以推断,JDBC的executeBatch和存储过程的批量执行都采用了相同的方法

    存储过程的这个关键点就是forall。查阅相关文档。

    The FORALL statement runs one DML statement multiple times, with different values in the VALUES and WHERE clauses.
    The different values come from existing, populated collections or host arrays. The FORALL statement is usually much faster than an equivalent FOR LOOP statement.
    The FORALL syntax allows us to bind the contents of a collection to a single DML statement, allowing the DML to be run for each row in the collection without requiring a context switch each time.

    翻译过来就是forall很快,原因就是不需要每次执行的时候等待参数。

    The above is the detailed content of How to implement JDBC batch insert in Java. For more information, please follow other related articles on the PHP Chinese website!

    Statement
    This article is reproduced at:亿速云. If there is any infringement, please contact admin@php.cn delete
    带你搞懂Java结构化数据处理开源库SPL带你搞懂Java结构化数据处理开源库SPLMay 24, 2022 pm 01:34 PM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于结构化数据处理开源库SPL的相关问题,下面就一起来看一下java下理想的结构化数据处理类库,希望对大家有帮助。

    Java集合框架之PriorityQueue优先级队列Java集合框架之PriorityQueue优先级队列Jun 09, 2022 am 11:47 AM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于PriorityQueue优先级队列的相关知识,Java集合框架中提供了PriorityQueue和PriorityBlockingQueue两种类型的优先级队列,PriorityQueue是线程不安全的,PriorityBlockingQueue是线程安全的,下面一起来看一下,希望对大家有帮助。

    完全掌握Java锁(图文解析)完全掌握Java锁(图文解析)Jun 14, 2022 am 11:47 AM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于java锁的相关问题,包括了独占锁、悲观锁、乐观锁、共享锁等等内容,下面一起来看一下,希望对大家有帮助。

    一起聊聊Java多线程之线程安全问题一起聊聊Java多线程之线程安全问题Apr 21, 2022 pm 06:17 PM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于多线程的相关问题,包括了线程安装、线程加锁与线程不安全的原因、线程安全的标准类等等内容,希望对大家有帮助。

    详细解析Java的this和super关键字详细解析Java的this和super关键字Apr 30, 2022 am 09:00 AM

    本篇文章给大家带来了关于Java的相关知识,其中主要介绍了关于关键字中this和super的相关问题,以及他们的一些区别,下面一起来看一下,希望对大家有帮助。

    Java基础归纳之枚举Java基础归纳之枚举May 26, 2022 am 11:50 AM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于枚举的相关问题,包括了枚举的基本操作、集合类对枚举的支持等等内容,下面一起来看一下,希望对大家有帮助。

    java中封装是什么java中封装是什么May 16, 2019 pm 06:08 PM

    封装是一种信息隐藏技术,是指一种将抽象性函式接口的实现细节部分包装、隐藏起来的方法;封装可以被认为是一个保护屏障,防止指定类的代码和数据被外部类定义的代码随机访问。封装可以通过关键字private,protected和public实现。

    归纳整理JAVA装饰器模式(实例详解)归纳整理JAVA装饰器模式(实例详解)May 05, 2022 pm 06:48 PM

    本篇文章给大家带来了关于java的相关知识,其中主要介绍了关于设计模式的相关问题,主要将装饰器模式的相关内容,指在不改变现有对象结构的情况下,动态地给该对象增加一些职责的模式,希望对大家有帮助。

    See all articles

    Hot AI Tools

    Undresser.AI Undress

    Undresser.AI Undress

    AI-powered app for creating realistic nude photos

    AI Clothes Remover

    AI Clothes Remover

    Online AI tool for removing clothes from photos.

    Undress AI Tool

    Undress AI Tool

    Undress images for free

    Clothoff.io

    Clothoff.io

    AI clothes remover

    AI Hentai Generator

    AI Hentai Generator

    Generate AI Hentai for free.

    Hot Article

    R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
    2 weeks agoBy尊渡假赌尊渡假赌尊渡假赌
    Repo: How To Revive Teammates
    1 months agoBy尊渡假赌尊渡假赌尊渡假赌
    Hello Kitty Island Adventure: How To Get Giant Seeds
    4 weeks agoBy尊渡假赌尊渡假赌尊渡假赌

    Hot Tools

    mPDF

    mPDF

    mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

    Safe Exam Browser

    Safe Exam Browser

    Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

    MinGW - Minimalist GNU for Windows

    MinGW - Minimalist GNU for Windows

    This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

    Notepad++7.3.1

    Notepad++7.3.1

    Easy-to-use and free code editor

    SublimeText3 Linux new version

    SublimeText3 Linux new version

    SublimeText3 Linux latest version