When Java reads a txt file with a charset that does not match the file's actual encoding, the text comes out garbled, so the charset must be set explicitly when reading. A txt file saved as Unicode or UTF-8 records its encoding in the first few bytes of the file (the byte order mark, or BOM). The program therefore inspects this file header first to work out the encoding, and then reads the file with that encoding, which avoids the garbled characters. In the example, a file without such a header is treated as ANSI (gb2312 or GBK).
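To make the mismatch concrete, here is a minimal sketch that decodes the same bytes with a wrong and a right charset. The class name MojibakeDemo and the helper decode are hypothetical, and ISO-8859-1 stands in for the "wrong" charset only because every JVM is guaranteed to ship it; the effect with gb2312 on a UTF-8 file is of the same kind.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Decodes raw bytes with the given charset, as a reader would.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding cannot interfere with the demonstration.
        String original = "\u4e2d\u6587";

        // The bytes as they would be stored in a UTF-8 file (3 bytes per character).
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);

        // Wrong charset: every byte is turned into a separate Latin-1 character.
        String garbled = decode(stored, StandardCharsets.ISO_8859_1);
        // Matching charset: the original text comes back.
        String correct = decode(stored, StandardCharsets.UTF_8);

        System.out.println("garbled equals original: " + garbled.equals(original));
        System.out.println("correct equals original: " + correct.equals(original));
        System.out.println("lengths: " + garbled.length() + " vs " + correct.length());
    }
}
```

The bytes on disk are identical in both cases; only the charset handed to the decoder decides whether the result is readable.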
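Correspondence between Java encoding names and txt encodings, as used in the example:
gb2312 (or GBK): ANSI, no header bytes
UTF-16: Unicode, header bytes FF FE
Unicode: Unicode big endian, header bytes FE FF
UTF-8: UTF-8, header bytes EF BB BF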
Example:
package com.lfl.attachment;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;

public class TextMain {

    public static void main(String[] args) throws Exception {
        String filePath = "D:/article.txt";
        // String filePath = "D:/article333.txt";
        // String filePath = "D:/article111.txt";
        String content = readTxt(filePath);
        System.out.println(content);
    }

    /**
     * Parses a plain text (stream) file such as a txt file.
     *
     * @param path path of the file to read
     * @return the text that was read; incomplete or empty if reading failed
     */
    public static String readTxt(String path) {
        StringBuilder content = new StringBuilder();
        try {
            String code = resolveCode(path);
            // try-with-resources closes the stream and the reader even if reading fails
            try (InputStream is = new FileInputStream(path);
                 BufferedReader br = new BufferedReader(new InputStreamReader(is, code))) {
                String str;
                while (null != (str = br.readLine())) {
                    // readLine() strips the line terminator, so put one back
                    content.append(str).append('\n');
                }
            }
            // Java's UTF-8 decoder keeps the BOM as the character U+FEFF; drop it if present
            if (content.length() > 0 && content.charAt(0) == '\uFEFF') {
                content.deleteCharAt(0);
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println("Failed to read file: " + path);
        }
        return content.toString();
    }

    public static String resolveCode(String path) throws Exception {
        // First three bytes of the sample files, as signed Java bytes:
        // D:/article.txt     [-76, -85, -71]  ANSI
        // D:/article111.txt  [-2, -1, 79]     Unicode big endian
        // D:/article222.txt  [-1, -2, 32]     Unicode
        // D:/article333.txt  [-17, -69, -65]  UTF-8
        byte[] head = new byte[3];
        try (InputStream inputStream = new FileInputStream(path)) {
            inputStream.read(head);
        }
        String code = "gb2312"; // or GBK
        if (head[0] == -1 && head[1] == -2) {
            code = "UTF-16";
        } else if (head[0] == -2 && head[1] == -1) {
            code = "Unicode";
        } else if (head[0] == -17 && head[1] == -69 && head[2] == -65) {
            code = "UTF-8";
        }
        System.out.println(code);
        return code;
    }
}
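The negative numbers compared in resolveCode are the header bytes viewed as signed Java bytes. As a small sketch (the class BomBytes and its helper unsigned are hypothetical names used only for illustration), masking with 0xFF shows the more familiar hexadecimal form:

```java
class BomBytes {

    // Converts a signed Java byte (-128..127) to its unsigned value (0..255).
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // The three header bytes the example checks for UTF-8.
        byte[] utf8Header = {-17, -69, -65};
        for (byte b : utf8Header) {
            // Prints the signed value next to its two-digit hex form.
            System.out.printf("%d -> 0x%02X%n", b, unsigned(b));
        }
    }
}
```

So -17, -69, -65 are EF BB BF, while -1, -2 are FF FE and -2, -1 are FE FF. Writing the comparison as head[0] == (byte) 0xEF is an equivalent spelling that some readers may find easier to match against BOM tables.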
Note: readTxt must not pass its own InputStream to the resolveCode method. If it did, both methods would hold a reference to the same stream, and resolveCode has already read three bytes from it, so the stream position (pos) is 3 rather than the start of the stream; resolveCode also closes the stream when it finishes. A later read in readTxt then fails with IOException: Read Error. This is why each method opens its own stream on the file.
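If a single stream is preferred, one alternative (not the approach used in the article) is to wrap it in a BufferedInputStream and use mark and reset, so that the header bytes are peeked at and the position is rewound before decoding. The following is a sketch under that assumption; the class SingleStreamRead and its methods detect and readAll are hypothetical names, and an in-memory byte array stands in for the file.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class SingleStreamRead {

    // Peeks at up to three header bytes, then rewinds to the start of the stream.
    static String detect(BufferedInputStream in) throws IOException {
        in.mark(3);                 // remember the current position
        int b0 = in.read();
        int b1 = in.read();
        int b2 = in.read();
        in.reset();                 // position is back where it was
        if ((b0 == 0xFF && b1 == 0xFE) || (b0 == 0xFE && b1 == 0xFF)) {
            return "UTF-16";        // the UTF-16 decoder reads the BOM to pick the byte order
        }
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) {
            return "UTF-8";
        }
        return "gb2312";            // same fallback as the article's example
    }

    // Detects the encoding and decodes the text from one and the same stream.
    static String readAll(InputStream raw) {
        StringBuilder sb = new StringBuilder();
        try (BufferedInputStream in = new BufferedInputStream(raw)) {
            String code = detect(in);
            BufferedReader br = new BufferedReader(new InputStreamReader(in, code));
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The UTF-8 decoder leaves the BOM in the text as U+FEFF; remove it.
        if (sb.length() > 0 && sb.charAt(0) == '\uFEFF') {
            sb.deleteCharAt(0);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Simulated UTF-8 file: BOM followed by two lines of text.
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        file.write(0xEF);
        file.write(0xBB);
        file.write(0xBF);
        byte[] body = "hello\nworld".getBytes(StandardCharsets.UTF_8);
        file.write(body, 0, body.length);
        System.out.print(readAll(new ByteArrayInputStream(file.toByteArray())));
    }
}
```

For a real file, new FileInputStream(path) could be passed to readAll in place of the byte array stream. The trade-off against the article's two-stream version is that the file is opened only once, at the cost of depending on mark/reset support.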
