When Java reads a text file (such as a CSV or TXT file) that contains Chinese characters, the text can come out garbled.
The reading code is as follows:
List<String> lines = new ArrayList<String>();
// FileReader decodes the file's bytes with the platform default charset.
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = null;
while ((line = br.readLine()) != null) {
    lines.add(line);
}
br.close();
Why Java produces garbled text when reading files:
Java's I/O classes process input as shown in the figure:
Reader is the parent class for reading characters in Java's I/O library, and InputStream is the parent class for reading bytes. InputStreamReader is the bridge from bytes to characters: it converts the bytes it reads into characters during I/O. The actual decoding is implemented by StreamDecoder, which needs to know which Charset to decode with. If you do not specify a Charset, the platform default character set is used. For example, on a Chinese-locale Windows system running JDK 17 or earlier, the default is typically GBK (since JDK 18 the default is UTF-8 unless it is overridden). FileReader, used in the code above, is a subclass of InputStreamReader that decodes with this default, so a UTF-8 file read on a GBK system comes out garbled.
Summary: when Java decodes a byte stream into characters, specify the stream's encoding explicitly; otherwise the platform default character set is used.
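To make the mismatch concrete, here is a minimal sketch that decodes the same UTF-8 bytes with different charsets. The class name CharsetMismatchDemo and the helper decode are hypothetical names chosen for illustration, and new String(bytes, charset) stands in for the decoding that InputStreamReader performs on a stream. The GBK step is guarded because GBK is an extended charset that may be missing on a trimmed-down runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {

    // Hypothetical helper: decode raw bytes with the given charset,
    // analogous to what InputStreamReader does for a byte stream.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as Unicode escapes, so the source
        // file itself does not depend on any particular encoding.
        String original = "\u4e2d\u6587";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        // The charset that FileReader would fall back to on this machine.
        System.out.println("Platform default charset: " + Charset.defaultCharset());

        // Decoding with the charset the bytes were written in round-trips.
        System.out.println("UTF-8 round trip intact: "
                + decode(utf8Bytes, StandardCharsets.UTF_8).equals(original));

        // Decoding the same bytes as GBK yields different characters.
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK decode intact: "
                    + decode(utf8Bytes, Charset.forName("GBK")).equals(original));
        }
    }
}
```

Running this on the machine that shows the garbling is a quick way to check which default charset the JVM has picked up.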
Based on this analysis, the modified code is as follows:
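List<String> lines = new ArrayList<String>();
// Decode the file's bytes explicitly as UTF-8 instead of relying on the platform default.
// try-with-resources closes the reader even if readLine() throws.
try (BufferedReader br = new BufferedReader(
        new InputStreamReader(new FileInputStream(fileName), "UTF-8"))) {
    String line;
    while ((line = br.readLine()) != null) {
        lines.add(line);
    }
}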
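As an alternative sketch, assuming Java 7 or later, the same read can be written with java.nio.file.Files, which takes the charset as a parameter. The class name Utf8LineReader, the helper readLines, and the temporary file contents are hypothetical and exist only to make the example self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class Utf8LineReader {

    // Hypothetical helper: read every line of the file, decoding its bytes as UTF-8.
    static List<String> readLines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a small UTF-8 sample file so the example runs on its own.
        Path tmp = Files.createTempFile("utf8-demo", ".csv");
        try {
            String content = "id,name\n1,\u4e2d\u6587\n";
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));

            List<String> lines = readLines(tmp);
            System.out.println(lines.size() + " lines read");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

One difference worth knowing: Files.readAllLines throws an IOException when it meets bytes that are not valid in the requested charset, whereas InputStreamReader substitutes replacement characters, so a wrong charset choice fails loudly instead of silently producing garbled text.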