Solution to garbled characters in java reading txt files

尚
Original
2019-11-30 09:32:08

When Java reads a txt file with a charset that does not match the file's actual encoding, garbled characters appear, so the reader must be given the correct encoding. Text files saved as UTF-8 or Unicode (UTF-16) often begin with a byte order mark (BOM) in the first two or three bytes of the file, while ANSI files have no such marker. The program therefore inspects the file header first, picks the encoding it indicates, and then reads the file with that encoding. A file without a BOM cannot be identified this way, so the example falls back to gb2312 for it.
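To make the cause of the garbling concrete, here is a minimal sketch that decodes the same bytes with a matching and a mismatched charset. The class name GarbledDemo and the method decode are illustrative only. UTF-8 and ISO-8859-1 are used because every JVM is guaranteed to support them; the same effect occurs when GBK bytes are decoded as UTF-8 or the other way round.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GarbledDemo {

    /** Decodes raw bytes with the given charset. */
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the encoding
        // of this source file does not matter.
        String original = "\u4e2d\u6587";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false: each byte became its own character
    }
}
```

The bytes on disk are identical in both cases; only the charset passed to the decoder decides whether the text survives.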

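Correspondence between Java encoding and txt encoding, as used in the example below (header bytes shown in hex):

ANSI: no byte order mark; read as gb2312 or GBK
Unicode: header FF FE; read as UTF-16
Unicode big endian: header FE FF; read as UTF-16
UTF-8: header EF BB BF; read as UTF-8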

Example:

package com.lfl.attachment;  
  
import java.io.BufferedReader;  
import java.io.File;  
import java.io.FileInputStream;  
import java.io.InputStream;  
import java.io.InputStreamReader;  
  
public class TextMain {  
  
    public static void main(String[] args) throws Exception {  
        String filePath = "D:/article.txt";  
//      String filePath = "D:/article333.txt";    
//      String filePath = "D:/article111.txt";    
        String content = readTxt(filePath);  
        System.out.println(content);  
          
    }  
  
      
      
    /**
     * Reads a plain text (stream) file such as a .txt file.
     * @param path path of the file to read
     * @return the file content, or what was read so far if reading fails
     */
    public static String readTxt(String path) {
        StringBuilder content = new StringBuilder();
        try {
            // Detect the encoding from the file header before decoding.
            String code = resolveCode(path);
            File file = new File(path);
            // try-with-resources closes the reader and the stream even on error.
            try (InputStream is = new FileInputStream(file);
                 BufferedReader br = new BufferedReader(new InputStreamReader(is, code))) {
                String str;
                while (null != (str = br.readLine())) {
                    // readLine() strips the line terminator, so add it back.
                    content.append(str).append(System.lineSeparator());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println("Failed to read file: " + path);
        }
        // Java's UTF-8 decoder keeps the BOM as the character U+FEFF; drop it.
        if (content.length() > 0 && content.charAt(0) == '\uFEFF') {
            content.deleteCharAt(0);
        }
        return content.toString();
    }
      
      
      
    /**
     * Guesses the encoding from the byte order mark (BOM) at the start of the file.
     * Files without a BOM are assumed to be gb2312 (ANSI).
     */
    public static String resolveCode(String path) throws Exception {
        // Sample headers, as signed Java bytes:
        //   D:/article.txt     [-76, -85, -71]  ANSI (no BOM)
        //   D:/article111.txt  [-2, -1, 79]     Unicode big endian (FE FF)
        //   D:/article222.txt  [-1, -2, 32]     Unicode, little endian (FF FE)
        //   D:/article333.txt  [-17, -69, -65]  UTF-8 (EF BB BF)
        byte[] head = new byte[3];
        try (InputStream inputStream = new FileInputStream(path)) {
            inputStream.read(head); // bytes that are not filled stay 0
        }
        String code = "gb2312"; // or GBK
        if ((head[0] == -1 && head[1] == -2) || (head[0] == -2 && head[1] == -1)) {
            // "UTF-16" is a standard charset name; its decoder reads the BOM
            // to choose the byte order.
            code = "UTF-16";
        } else if (head[0] == -17 && head[1] == -69 && head[2] == -65) {
            code = "UTF-8";
        }

        System.out.println(code);
        return code;
    }
      
}
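The comparisons in resolveCode use negative numbers because Java's byte type is signed (range -128 to 127), so the BOM byte 0xEF appears as -17. The sketch below shows the conversion in both directions; the class name BomBytes and its helper methods are illustrative and not part of the original example.

```java
public class BomBytes {

    /** Returns the unsigned value (0..255) of a signed Java byte. */
    public static int unsigned(byte b) {
        return b & 0xFF;
    }

    /** Formats bytes as two-digit uppercase hex separated by spaces. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", unsigned(b)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf8Bom = {-17, -69, -65};
        byte[] utf16LittleEndianBom = {-1, -2};
        byte[] utf16BigEndianBom = {-2, -1};

        System.out.println(toHex(utf8Bom));              // EF BB BF
        System.out.println(toHex(utf16LittleEndianBom)); // FF FE
        System.out.println(toHex(utf16BigEndianBom));    // FE FF
        System.out.println((byte) 0xEF == -17);          // true
    }
}
```

Writing the checks as head[0] == (byte) 0xEF is equivalent to head[0] == -17 and may be easier to compare against a hex dump of the file.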

Note: Do not open one InputStream in readTxt and pass it into resolveCode. Both methods would then hold a reference to the same stream, and resolveCode has already read three bytes from it, so the stream position is 3 rather than the start of the stream. resolveCode also closes the stream when it finishes. A later read in readTxt then fails with an IOException (the original author reports the message "Read Error"). This is why each method opens its own stream on the file.
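If opening the file twice is undesirable, one stream can serve both steps as long as the bytes read during detection are pushed back before decoding. The following is a minimal sketch of that idea using java.io.PushbackInputStream, not the article's method. The class name BomReader, the methods readAll and decode, and the fallback parameter are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomReader {

    /**
     * Reads all text from the stream. The charset is taken from the BOM if
     * one is present; otherwise the caller's fallback charset is used.
     */
    public static String readAll(InputStream in, Charset fallback) throws IOException {
        PushbackInputStream pin = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Read up to 3 header bytes; a single read() may return fewer.
        while (n < head.length) {
            int r = pin.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        Charset charset = fallback;
        int bomLength = 0;
        if (n >= 3 && head[0] == (byte) 0xEF && head[1] == (byte) 0xBB && head[2] == (byte) 0xBF) {
            charset = StandardCharsets.UTF_8;
            bomLength = 3;
        } else if (n >= 2 && head[0] == (byte) 0xFF && head[1] == (byte) 0xFE) {
            charset = StandardCharsets.UTF_16LE;
            bomLength = 2;
        } else if (n >= 2 && head[0] == (byte) 0xFE && head[1] == (byte) 0xFF) {
            charset = StandardCharsets.UTF_16BE;
            bomLength = 2;
        }

        // Push back everything after the BOM so decoding starts at the first real character.
        if (n > bomLength) {
            pin.unread(head, bomLength, n - bomLength);
        }

        StringBuilder sb = new StringBuilder();
        // Closing the reader also closes the wrapped streams.
        try (Reader reader = new InputStreamReader(pin, charset)) {
            char[] buf = new char[1024];
            int len;
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
        }
        return sb.toString();
    }

    /** Convenience wrapper for in-memory data; wraps IOException as unchecked. */
    public static String decode(byte[] data, Charset fallback) {
        try {
            return readAll(new ByteArrayInputStream(data), fallback);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, (byte) 'h', (byte) 'i'};
        System.out.println(decode(data, StandardCharsets.ISO_8859_1)); // hi
    }
}
```

For a file like the ANSI sample in the article, the call would look like readAll(new FileInputStream(path), Charset.forName("GBK")), assuming the JVM provides the GBK charset.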
