A byte is a unit of measurement used in computing to express storage capacity. It is a string of binary digits that is processed as a single unit, and it is a small building block of information. The most common byte is the octet, which contains eight binary digits (bits).
Different encodings use a different number of bytes for one character:
ASCII code:
One English letter (uppercase or lowercase) occupies one byte. ASCII itself does not include Chinese characters; in the Chinese extensions of ASCII, such as GB2312 and GBK, one Chinese character occupies two bytes. A byte is a sequence of binary digits used as a unit in the computer, generally eight bits; converted to decimal, its minimum value is 0 and its maximum value is 255. For example, one ASCII code fits in one byte.
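As a minimal sketch of the points above, the following Python snippet checks the value range of a byte and the encoded length of single characters. The choice of GBK for the Chinese character is an assumption made for illustration, since ASCII alone cannot encode it.

```python
# One byte holds 8 bits, so it can represent 2**8 = 256 values: 0 through 255.
BITS_PER_BYTE = 8
min_value = 0
max_value = 2**BITS_PER_BYTE - 1

# An English letter encodes to a single byte in ASCII, in either case.
upper = "A".encode("ascii")
lower = "a".encode("ascii")

# ASCII has no Chinese characters; GBK (assumed here) uses two bytes for one.
chinese = "中".encode("gbk")

print(min_value, max_value)      # 0 255
print(len(upper), len(lower))    # 1 1
print(len(chinese))              # 2
```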
UTF-8 encoding:
One English character occupies one byte, and one Chinese character (including traditional Chinese) usually occupies three bytes. A small number of rare characters occupy four bytes.
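The UTF-8 sizes can be checked with a short Python sketch. The sample characters are chosen here for illustration only.

```python
# Encode one English letter and two common Chinese characters as UTF-8.
english = "A".encode("utf-8")
simplified = "汉".encode("utf-8")    # simplified Chinese
traditional = "漢".encode("utf-8")   # traditional Chinese

print(len(english))      # 1
print(len(simplified))   # 3
print(len(traditional))  # 3
```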
Unicode encoding:
Here "Unicode" refers to the two-byte Unicode encoding (UTF-16). One English character occupies two bytes, and one Chinese character (including traditional Chinese) also occupies two bytes.
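A small Python sketch of the two-byte case, assuming the encoding meant here is UTF-16. The codec name "utf-16-le" is used because Python's plain "utf-16" codec prepends a two-byte byte order mark, which would inflate the count.

```python
# "utf-16-le" writes the code units only, without a byte order mark.
english = "A".encode("utf-16-le")
chinese = "中".encode("utf-16-le")

print(len(english))  # 2
print(len(chinese))  # 2
```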
Symbols:
In ASCII-compatible encodings such as GBK, English punctuation occupies one byte and Chinese punctuation occupies two bytes. For example: the English period "." occupies 1 byte, and the Chinese period "。" occupies 2 bytes. In UTF-8, Chinese punctuation occupies three bytes.
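The same comparison for punctuation, as a hedged Python sketch. GBK is assumed as the encoding in which Chinese punctuation takes two bytes, and the UTF-8 length is shown alongside for contrast.

```python
english_period = "."
chinese_period = "。"  # U+3002, the full-width Chinese period

# Byte counts in GBK (assumed encoding for the two-byte case).
english_gbk = len(english_period.encode("gbk"))
chinese_gbk = len(chinese_period.encode("gbk"))

# Byte count of the Chinese period in UTF-8, for comparison.
chinese_utf8 = len(chinese_period.encode("utf-8"))

print(english_gbk)   # 1
print(chinese_gbk)   # 2
print(chinese_utf8)  # 3
```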
Summary:
In ASCII and its Chinese extensions such as GBK, 8 bits represent an English character and 16 bits represent a Chinese character. In two-byte Unicode (UTF-16), 16 bits represent either an English or a Chinese character. In UTF-8, 8 bits represent an English character and 24 bits represent a Chinese character.
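To express the summary in bits rather than bytes, here is a minimal sketch. The helper name bits_per_char is hypothetical and introduced only for this illustration, and GBK stands in for the ASCII-compatible Chinese encodings.

```python
def bits_per_char(char: str, encoding: str) -> int:
    """Return how many bits one character occupies in the given encoding."""
    return len(char.encode(encoding)) * 8

# Collect the bit counts for one English and one Chinese character.
table = {
    encoding: (bits_per_char("A", encoding), bits_per_char("中", encoding))
    for encoding in ("gbk", "utf-16-le", "utf-8")
}

for encoding, (english_bits, chinese_bits) in table.items():
    print(encoding, english_bits, chinese_bits)
# gbk 8 16
# utf-16-le 16 16
# utf-8 8 24
```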