
How to efficiently read and write files larger than 10 GB in Java

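I have a text file larger than 10 GB and need to replace some text in it (every line contains something to replace). How can I efficiently read it, apply the replacements, and write the result to a new file?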

黄舟 2765 days ago 583

All replies (5)

  • 伊谢尔伦

    伊谢尔伦 2017-04-18 10:54:01

    1. First split the file into multiple smaller files.

    2. Process the files with multiple threads, one file per thread, so that no two threads work on the same file.

    3. Read each file line by line and write its new file line by line.

    4. Merge all the output files.

    Steps 1 and 4 can be done with Linux commands.
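Steps 2 and 3 could be sketched in Java roughly as follows. This is only an illustration under assumptions: the chunk files are assumed to have been split on line boundaries beforehand (for example with `split -l`), and the names `ChunkProcessor`, `processChunk`, `processAll` and the `.out` suffix are hypothetical, not from the answer above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChunkProcessor {
    // Rewrites one chunk line by line into "<chunk>.out" (hypothetical naming)
    // and returns the path of the new file.
    static Path processChunk(Path chunk, String target, String replacement) throws IOException {
        Path out = Paths.get(chunk.toString() + ".out");
        try (BufferedReader reader = Files.newBufferedReader(chunk, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(target, replacement));
                writer.write('\n'); // readLine() strips the line terminator
            }
        }
        return out;
    }

    // Submits one task per chunk, so no two threads ever touch the same file.
    static List<Path> processAll(List<Path> chunks, String target, String replacement, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (Path chunk : chunks) {
                futures.add(pool.submit(() -> processChunk(chunk, target, replacement)));
            }
            List<Path> outputs = new ArrayList<>();
            for (Future<Path> f : futures) {
                outputs.add(f.get()); // collected in submission order, ready for the merge
            }
            return outputs;
        } finally {
            pool.shutdown();
        }
    }
}
```

The returned list keeps the order of the input chunks, so the outputs can be concatenated in that order (for example with `cat`) for step 4. Whether the extra threads help depends on whether the disk or the CPU is the bottleneck.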

  • 怪我咯

    怪我咯 2017-04-18 10:54:01

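    import java.io.BufferedInputStream;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;

    File file = new File(filepath);
    BufferedInputStream fis = new BufferedInputStream(new FileInputStream(file));
    // 5 MB read buffer
    BufferedReader reader = new BufferedReader(new InputStreamReader(fis, "utf-8"), 5 * 1024 * 1024);
    String line = "";
    while ((line = reader.readLine()) != null) {
        // do the replacement and any other business logic here
    }
    reader.close();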
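The snippet above only covers reading. A minimal sketch of the full read, replace, and write loop might look like this; the class and method names (`LineReplacer`, `replaceLines`) are hypothetical, and a plain literal replacement stands in for whatever business logic is actually needed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LineReplacer {
    // Streams `in` line by line, replaces every literal occurrence of `target`
    // with `replacement`, writes each line to `out`, and returns the line count.
    // Only one line is held in memory at a time, so file size is not a limit.
    static long replaceLines(Path in, Path out, String target, String replacement) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line.replace(target, replacement));
                writer.write('\n'); // readLine() strips the terminator, so add one back
                count++;
            }
        }
        return count;
    }
}
```

Note that this sketch writes `\n` after every line, so Windows line endings would be normalized and the output always ends with a newline.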

  • 迷茫

    迷茫 2017-04-18 10:54:01

    To improve performance, you may want to use memory-mapped I/O. For details, see:

    1. Why use Memory Mapped File or MappedByteBuffer in Java

    2. Java large-file read and write operations, Java NIO's MappedByteBuffer, efficient file/memory mapping

    3. A simple comparison of the performance of java.io and java.nio
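As a rough illustration of the mapped I/O idea, the sketch below scans a file through `MappedByteBuffer` windows and counts newline bytes. It shows only the reading side; a replacement that changes line lengths would still have to write a new file. The names `MappedLineCounter`, `countNewlines` and the 256 MB window size are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedLineCounter {
    // A single mapping cannot exceed Integer.MAX_VALUE bytes, so a 10 GB file
    // has to be mapped window by window. 256 MB is an arbitrary choice.
    static final long WINDOW = 256L * 1024 * 1024;

    // Counts '\n' bytes in `file`, mapping at most `windowSize` bytes at a time.
    static long countNewlines(Path file, long windowSize) throws IOException {
        long newlines = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += windowSize) {
                long len = Math.min(windowSize, size - pos);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    if (buf.get() == '\n') {
                        newlines++;
                    }
                }
            }
        }
        return newlines;
    }
}
```

A call such as `MappedLineCounter.countNewlines(path, MappedLineCounter.WINDOW)` walks the whole file. Because a line can straddle two windows, code that needs whole lines has to carry the partial line over from one window to the next.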

  • 天蓬老师

    天蓬老师 2017-04-18 10:54:01

    For a simple text replacement, just use the Linux sed command.

    For a more complex text replacement, see:

    1. http://stackoverflow.com/ques...

    2. http://www.baeldung.com/java-...

  • 怪我咯

    怪我咯 2017-04-18 10:54:01

    Analyze it with Spark:
    lines = sc.textFile("your_file")
    filterlines = lines.filter(your_filter_function)
    filterlines.xxx()
