Garbled characters appear when converting byte[] to String in Java

王林

Nov 27, 2019 am 09:28 AM

At first glance, converting a String to byte[] and back is very simple:

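public class ByteStringDemo {
    public static void main(String[] args) {
        // The label "打印" in the output means "print".
        String str = "我是中国人";
        // Encode with the platform default charset.
        byte[] arr = str.getBytes();
        // Concatenating an array prints its type and hash code, not its contents.
        System.out.println("打印：" + arr);
        for (byte e : arr) {
            System.out.print(e + " ");
        }
        // Decode with the same default charset, so this round trip succeeds.
        String str2 = new String(arr);
        System.out.println("\n打印2：" + str2);
    }
}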

On a system whose default charset is GBK, the program above prints:

打印：[B@15db9742
-50 -46 -54 -57 -42 -48 -71 -6 -56 -53 
打印2：我是中国人

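The byte values reveal the encoding. A byte is eight bits, and in GBK (the default charset on the system that produced this output) each Chinese character occupies two bytes, so the five characters are stored as ten byte values. Under UTF-8 the same characters would occupy three bytes each. Turning those numbers back into Chinese characters therefore requires an encoding standard that says which byte sequences map to which characters.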
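The byte counts can be checked with a minimal sketch like the one below. It assumes the JDK in use provides the GBK charset (most full JDKs do, but it is not guaranteed), and the class and method names are illustrative only. The sample string is the same five-character string as in the example, written with Unicode escapes so that the encoding of the source file itself does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodedLengthDemo {
    // Returns the number of bytes needed to encode text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The five-character sample string, as Unicode escapes.
        String text = "\u6211\u662f\u4e2d\u56fd\u4eba";
        // Assumption: this JDK ships the GBK charset.
        Charset gbk = Charset.forName("GBK");
        System.out.println("GBK bytes:   " + encodedLength(text, gbk));                     // 10
        System.out.println("UTF-8 bytes: " + encodedLength(text, StandardCharsets.UTF_8)); // 15
    }
}
```

Passing the charset explicitly to getBytes, rather than relying on the platform default, makes the result the same on every machine.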

So how does Java handle character encoding?

Java has its own String class, and a String object does not need a code table to be specified. How does it know which character each number represents? Because the characters in a String are stored as Unicode, specifically as UTF-16 code units. To represent a single code unit, Java has the data type char, which is a fixed 16 bits wide (two bytes, or four hexadecimal digits) with a range of 0 to 65535. A char corresponds directly to one Unicode character in the Basic Multilingual Plane; characters outside that range take two chars (a surrogate pair).

To get the Unicode values stored in a String, call getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin), which copies the characters into a char[]. Each element of that char[] is a UTF-16 code unit, which for most characters is simply the character's number in the Unicode code table.
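As a small illustration of getChars, the sketch below copies the code units of the five-character sample string into a char[] and prints them in hexadecimal. The class and helper names are hypothetical; only String.getChars comes from the text above.

```java
final class CodeUnitDemo {
    // Copies the UTF-16 code units of text into a new char[] using String.getChars.
    static char[] codeUnits(String text) {
        char[] dst = new char[text.length()];
        text.getChars(0, text.length(), dst, 0);
        return dst;
    }

    public static void main(String[] args) {
        // The five-character sample string, as Unicode escapes.
        String text = "\u6211\u662f\u4e2d\u56fd\u4eba";
        for (char c : codeUnits(text)) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
        // Prints: U+6211 U+662F U+4E2D U+56FD U+4EBA
    }
}
```

These values do not depend on GBK, UTF-8, or any other byte encoding, which is why a String needs no code table once it has been decoded.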

Why do garbled characters appear when converting byte[] to String?

As described above, the cause is mismatched encoding standards. For example, the Chinese character 当 ("dang") is represented in GB2312 by the two bytes 0xB5 and 0xB1. A system configured for English typically does not use the GB2312 table by default; given 0xB5, 0xB1, it decodes each byte as a separate single-byte character (both values are outside the ASCII range, so the result is usually Latin-1 symbols such as µ±). Java behaves the same way: new String(byte[]) decodes the bytes with the platform default charset and stores the result as Unicode, so when that charset differs from the one used to produce the bytes, the result is garbled characters.
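The mismatch can be reproduced with a short sketch that decodes the same two bytes with two different charsets. It assumes the JDK provides a charset named GB2312 (common in full JDKs, but not guaranteed); the class and method names are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Decodes raw bytes with an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xB5, (byte) 0xB1};
        // Assumption: this JDK ships the GB2312 charset.
        Charset gb2312 = Charset.forName("GB2312");

        String correct = decode(raw, gb2312);                      // one char, U+5F53
        String garbled = decode(raw, StandardCharsets.ISO_8859_1); // two chars, U+00B5 U+00B1

        System.out.println(correct.length() + " char(s): " + correct);
        System.out.println(garbled.length() + " char(s): " + garbled);
    }
}
```

When the bytes are known to be text, naming the charset on both the encoding side (getBytes) and the decoding side (new String) is usually enough to avoid garbled output.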

So how do we fix garbled characters when converting byte[] to String?

It depends on where the byte[] comes from. If the bytes are encoded text, decode them with the same charset that was used to encode them. A different and common case is an image that must be converted to byte[] and then to a String for transmission elsewhere; the receiver converts the String back to byte[] and then back to an image.

1. If the byte[] is converted to a String directly, data can be lost, because not every byte combination maps to a char. Byte sequences that are invalid in the charset are replaced during decoding and cannot be recovered.

2. Use the standard Base64 encoding instead. Base64 splits the data into 6-bit groups and represents each group with one of 64 printable ASCII characters (hence the name), so every three bytes become four characters. Ready-made utility classes exist; for example, with Apache Commons Codec:

import org.apache.commons.codec.binary.Base64;

// Requires the Apache Commons Codec library on the classpath.
public class UtilHelper {
    // Decode a Base64 string to byte[]
    public static byte[] base64String2ByteFun(String base64Str) {
        return Base64.decodeBase64(base64Str);
    }

    // Encode byte[] as a Base64 string
    public static String byte2Base64StringFun(byte[] b) {
        return Base64.encodeBase64String(b);
    }
}

This guarantees a lossless, standard conversion between byte[] and String.
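For comparison, here is a minimal sketch of the two approaches in the list above, using the JDK's built-in java.util.Base64 instead of Commons Codec. This swap is an assumption that Java 8 or later is available; the class and method names are hypothetical, and the four sample bytes simply stand in for binary data such as the start of an image file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class Base64RoundTripDemo {
    // Encodes arbitrary bytes as a Base64 string.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes a Base64 string back to the original bytes.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Lossy path: treats arbitrary bytes as UTF-8 text and encodes them again.
    static byte[] viaPlainString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample binary data; these bytes are not valid UTF-8 text.
        byte[] data = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        String encoded = toBase64(data);
        System.out.println(encoded);                                   // /9j/4A==
        System.out.println(Arrays.equals(data, fromBase64(encoded)));  // true
        System.out.println(Arrays.equals(data, viaPlainString(data))); // false
    }
}
```

The Base64 round trip returns the original bytes, while the direct String conversion does not, because the invalid sequences are replaced during decoding.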
