
What is the difference between java characters and bytes

爱喝马黛茶的安东尼 (Original)
2019-11-12 15:24:47, 3381 views


byte is one of Java's primitive data types and is used to declare byte-typed variables. A byte holds 8 bits and is signed, so its value range is -128 to 127.
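As a small sketch of that value range (the class and method names here are hypothetical, chosen only for illustration), narrowing an int to a byte keeps the low 8 bits, and masking with 0xFF reads the same bits back as an unsigned value:

```java
// Illustrative sketch; ByteRangeDemo and its methods are hypothetical names.
final class ByteRangeDemo {

    // Narrowing an int to byte keeps only the low 8 bits (two's complement).
    static byte narrow(int value) {
        return (byte) value;
    }

    // Masking with 0xFF reads the same 8 bits as an unsigned value in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
        System.out.println(narrow(128));             // -128
        System.out.println(narrow(200));             // -56
        System.out.println(toUnsigned((byte) -56));  // 200
    }
}
```

The mask is the usual way to treat a byte read from a file or socket as a value between 0 and 255.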

When reading non-text files (such as images, audio, or executables), you usually store the file contents in a byte array. When downloading a file, a byte array also serves as the temporary buffer that receives the content. byte is therefore essential in file operations, for both reading and writing.
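The buffering pattern described above might look like the following sketch. The class name, method names, and the 8 KiB buffer size are assumptions made for illustration, and in-memory streams stand in for a real file so the example runs without touching disk:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Illustrative sketch; ByteBufferCopy and its methods are hypothetical names.
final class ByteBufferCopy {

    // Copies all bytes from in to out through a reusable byte[] buffer
    // and returns how many bytes were copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // 8 KiB is an arbitrary buffer size
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read); // write only the bytes actually read
            total += read;
        }
        return total;
    }

    // In-memory stand-in for a file copy.
    static byte[] copyInMemory(byte[] source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(source), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] fakeImage = new byte[20_000]; // placeholder for binary file content
        System.out.println(copyInMemory(fakeImage).length); // 20000
    }
}
```

For real files, the same copy method can be given streams from java.nio.file.Files.newInputStream and Files.newOutputStream.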

In some programs (especially hardware-related ones), data is packed into a byte variable such as 00110010, where each bit represents one parameter; the parameters are then read and set with bitwise operations.
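A minimal sketch of reading and setting individual bits, using the 00110010 value from the text. The class and method names are hypothetical, and bits are numbered from 0 at the least significant end:

```java
// Illustrative sketch; BitFlags and its methods are hypothetical names.
final class BitFlags {

    // Returns true if the given bit (0 = least significant, 7 = most) is 1.
    static boolean isSet(byte flags, int bit) {
        return ((flags >> bit) & 1) == 1;
    }

    // Returns a copy of flags with the given bit set to 1.
    static byte set(byte flags, int bit) {
        return (byte) (flags | (1 << bit));
    }

    // Returns a copy of flags with the given bit set to 0.
    static byte clear(byte flags, int bit) {
        return (byte) (flags & ~(1 << bit));
    }

    public static void main(String[] args) {
        byte flags = 0b0011_0010;                 // bits 1, 4 and 5 are set
        System.out.println(isSet(flags, 1));      // true
        System.out.println(isSet(flags, 0));      // false
        System.out.println(set(flags, 0));        // 51 (00110011)
        System.out.println(clear(flags, 1));      // 48 (00110000)
    }
}
```

The casts back to byte are needed because Java promotes byte operands to int before applying bitwise operators.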

The machine only knows bytes, whereas characters are semantic units that must be encoded. Depending on the character encoding, one character may be encoded as 1, 2, 3, or 4 bytes. In ASCII-compatible encodings, English letters and digits take a single byte, while characters such as Chinese characters take multiple bytes. One byte can take only 256 distinct values, far too few for the characters of the world's natural languages, so multi-byte encodings are needed.

So in file input and output, InputStream and OutputStream handle byte streams, treating everything as raw binary bytes, while Reader and Writer are character streams, which involve a character encoding. In the "ANSI" code pages (for Chinese, GB2312 or GBK), punctuation marks, digits, and uppercase and lowercase letters each occupy one byte, and Chinese characters occupy 2 bytes. In Unicode, the number of bytes depends on the encoding form: in UTF-16, most characters occupy 2 bytes and supplementary characters occupy 4.
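To make the byte stream versus character stream distinction concrete, here is a rough sketch that decodes the same bytes through an InputStreamReader with different charsets. The class and method names are hypothetical, and the Chinese character 中 is written as the escape \u4e2d so the result does not depend on the source file's encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch; StreamVsReader and countChars are hypothetical names.
final class StreamVsReader {

    // Counts the char values a Reader produces when it decodes
    // the given bytes with the given charset.
    static int countChars(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int count = 0;
            while (reader.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "a\u4e2d".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length);                                    // 4 bytes
        System.out.println(countChars(utf8, StandardCharsets.UTF_8));      // 2 chars
        System.out.println(countChars(utf8, StandardCharsets.ISO_8859_1)); // 4 chars (mis-decoded)
    }
}
```

Passing the charset explicitly avoids relying on the platform default, which is a common source of garbled text.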

Bytes:

1. 1 bit holds one binary digit, 0 or 1.

2. 1 byte = 8 bits. The byte is the basic unit of measurement for storage space.

3. One English letter = 1 byte = 8 bits (in ASCII and ASCII-compatible encodings).

4. One Chinese character = 2 bytes = 16 bits (in GB2312 or GBK; other encodings differ).

Characters:

Java uses Unicode to represent characters. A char in Java is 2 bytes: one UTF-16 code unit. A common Chinese or English character therefore takes one char (2 bytes) in memory, while a supplementary character (a code point above U+FFFF) takes two chars. The number of bytes a character occupies varies with other encodings.
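The size of a char can be checked with a short sketch like the one below. The class and method names are hypothetical; U+20000, the first code point of CJK Extension B, is used as an example of a character outside the 16-bit range:

```java
// Illustrative sketch; CharUnits and its methods are hypothetical names.
final class CharUnits {

    // Number of 16-bit char values (UTF-16 code units) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String common = "\u4e2d";                                    // one common Chinese character
        String supplementary = new String(Character.toChars(0x20000)); // CJK Extension B

        System.out.println(Character.BYTES);            // 2
        System.out.println(codeUnits(common));          // 1
        System.out.println(codeUnits(supplementary));   // 2 (a surrogate pair)
        System.out.println(codePoints(supplementary));  // 1
    }
}
```

This is why String.length() can be larger than the number of characters a user sees.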

In GB2312 or GBK encoding, an English letter requires 1 byte and a Chinese character requires 2 bytes.

In UTF-8 encoding, an English letter requires 1 byte, and a Chinese character requires 3 bytes (rarer characters in the Unicode extension areas beyond U+FFFF require 4 bytes).

In UTF-16 encoding, an English letter requires 2 bytes, and a Chinese character requires 2 or 4 bytes (some Chinese characters in the Unicode extension areas require 4 bytes).

In UTF-32 encoding, every character requires 4 bytes.
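The byte counts listed above can be measured with a sketch along these lines. The class and method names are hypothetical. GBK and UTF-32BE are not among the charsets every Java platform is required to support, so their availability is an assumption and the loop skips them when they are missing. The big-endian variants are used because plain "UTF-16" prepends a 2-byte byte order mark when encoding:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch; EncodedLengths and byteLength are hypothetical names.
final class EncodedLengths {

    // Number of bytes the text occupies once encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "A";
        String chinese = "\u4e2d"; // a common Chinese character

        // GBK and UTF-32BE may be absent on some runtimes (assumption: present here).
        String[] names = {"GBK", "UTF-8", "UTF-16BE", "UTF-32BE"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not supported on this runtime");
                continue;
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + ": letter=" + byteLength(letter, cs)
                    + " chinese=" + byteLength(chinese, cs));
        }

        // A character beyond U+FFFF needs 4 bytes in both UTF-8 and UTF-16.
        String rare = new String(Character.toChars(0x20000));
        System.out.println(byteLength(rare, StandardCharsets.UTF_8));    // 4
        System.out.println(byteLength(rare, StandardCharsets.UTF_16BE)); // 4
    }
}
```

Note that getBytes silently substitutes a replacement byte sequence for characters a charset cannot represent, so a byte count is only meaningful when the charset actually covers the character.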

