How many bytes does an English letter occupy in an ascii code file?

藏色散人 (Original)
2021-11-22 15:07:06

In an ASCII-encoded file, an English letter occupies one byte. Chinese characters are not part of ASCII; in a double-byte encoding such as GB2312 or GBK, a Chinese character occupies two bytes. ASCII uses 7-bit binary codes to represent 128 characters, and its 8-bit extensions represent 256. The byte is a unit of binary data and is usually 8 bits long.

The operating environment for this article: Windows 7, Dell G3 computer.

How many bytes does one English letter occupy in an ASCII-encoded file?

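In ASCII, one English letter (uppercase or lowercase) occupies one byte. A Chinese character cannot be represented in ASCII at all; it occupies two bytes in a double-byte encoding such as GB2312 or GBK, and typically three bytes in UTF-8.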
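The byte counts can be checked directly. The following is a minimal Python sketch; the helper name `encoded_size` is made up for illustration, and the encodings shown (`ascii`, `gbk`, `utf-8`) are just three common choices. The size of a Chinese character depends on which encoding the file actually uses.

```python
def encoded_size(ch: str, encoding: str) -> int:
    """Return the number of bytes `ch` occupies when encoded with `encoding`."""
    return len(ch.encode(encoding))


# English letters, upper or lower case, take one byte each in ASCII.
for letter in ("A", "z"):
    print(letter, encoded_size(letter, "ascii"))

# A Chinese character is outside ASCII, so its size depends on the encoding.
print("中", encoded_size("中", "gbk"))    # double-byte encoding
print("中", encoded_size("中", "utf-8"))  # variable-length encoding
```

Trying `"中".encode("ascii")` raises `UnicodeEncodeError`, which is another way to see that the character has no ASCII code.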

ASCII:

ASCII (American Standard Code for Information Interchange) is a character encoding based on the Latin alphabet, used mainly to represent modern English and, to a limited extent, other Western European languages. It is one of the most widely used information interchange standards and corresponds to the US national variant of the international standard ISO/IEC 646. ASCII was first published as a standard in 1963, substantially revised in 1967, and last updated in 1986. It defines a total of 128 characters.

ASCII codes use 7-bit or 8-bit binary numbers to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 binary digits (when stored in a byte, the remaining high bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English [1]. Among them:

0~31 and 127 (33 in total) are control characters or communication-specific characters; the rest are printable characters. Examples of control characters are LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell). Examples of communication characters are SOH (start of heading), EOT (end of transmission), and ACK (acknowledge). The ASCII values 8, 9, 10, and 13 correspond to backspace, tab, line feed, and carriage return, respectively. These characters have no specific graphic representation, but they affect how text is displayed, depending on the application [1].

32~126 (95 in total) are printable characters (32 is the space), of which 48~57 are the ten Arabic numerals 0 to 9.

65~90 are the 26 uppercase English letters, 97~122 are the 26 lowercase English letters, and the rest are punctuation marks, arithmetic symbols, and so on.
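The ranges listed above can be written down as a small lookup. This is only an illustrative Python sketch; the function name `classify` and the category labels are assumptions chosen for this example, not part of any standard library.

```python
def classify(code: int) -> str:
    """Map a 7-bit ASCII code (0..127) to the category described in the text."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "symbol"  # punctuation marks, arithmetic symbols, and so on


# ord() gives the code of a character, chr() goes the other way.
for ch in ("A", "a", "7", " ", "+", "\n"):
    print(repr(ch), ord(ch), classify(ord(ch)))

# Count the control and printable codes to compare with the totals in the text.
controls = sum(classify(c) == "control" for c in range(128))
print(controls, 128 - controls)
```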

Also note that when a standard ASCII character is stored or transmitted in 8 bits, the highest bit (b7) can be used as a parity bit. Parity checking is a method of detecting errors during code transmission, and it comes in two forms: odd parity and even parity. Odd parity: the number of 1s in a correct byte must be odd; if the 7 data bits contain an even number of 1s, the highest bit b7 is set to 1. Even parity: the number of 1s in a correct byte must be even; if the 7 data bits contain an odd number of 1s, the highest bit b7 is set to 1.
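As a rough illustration of the parity rule, the Python sketch below sets b7 so that the whole byte has the requested parity. The names `add_parity` and `check_parity` are hypothetical helpers written for this example; real serial hardware and protocols handle parity in their own way.

```python
def add_parity(code: int, odd: bool = False) -> int:
    """Return the 8-bit value for a 7-bit ASCII code with b7 used as parity bit."""
    if not 0 <= code <= 0x7F:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")        # number of 1s in the 7 data bits
    if odd:
        need_bit = ones % 2 == 0       # odd parity: total count of 1s must be odd
    else:
        need_bit = ones % 2 == 1       # even parity: total count of 1s must be even
    return (code | 0x80) if need_bit else code


def check_parity(byte: int, odd: bool = False) -> bool:
    """Return True if the byte has the expected parity."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2 == (1 if odd else 0)


# 'A' is 0b1000001 (two 1s), 'C' is 0b1000011 (three 1s).
for ch in ("A", "C"):
    code = ord(ch)
    print(ch, format(add_parity(code), "08b"), format(add_parity(code, odd=True), "08b"))
```

A single flipped bit changes the count of 1s by one, so `check_parity` reports it; two flipped bits cancel out and go unnoticed, which is why parity is only a basic check.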

The upper 128 codes (128~255) are called extended ASCII. Many x86-based systems support extended (or "high") ASCII. Extended ASCII uses the 8th bit of each byte to define an additional 128 special symbols, foreign letters, and graphic symbols. There is no single extended ASCII standard; which characters these codes map to depends on the code page in use.
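To see what using the 8th bit means in practice, here is a small Python sketch. It assumes two specific 8-bit code pages as examples, `latin-1` (ISO 8859-1) and `cp437` (the original IBM PC code page); the helper `is_ascii` is written for this illustration.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte has its highest bit clear (value below 128)."""
    return all(b < 0x80 for b in data)


raw = bytes([0x41, 0xE9])  # 0x41 is 'A'; 0xE9 has the 8th bit set

print(is_ascii(b"A"), is_ascii(raw))

# The same byte value maps to different characters in different code pages.
print(raw.decode("latin-1"))
print(raw.decode("cp437"))

# Strict ASCII decoding rejects any byte of 128 or above.
try:
    raw.decode("ascii")
except UnicodeDecodeError:
    print("0xE9 is not a valid ASCII byte")
```

The first byte decodes to the same letter in every case, because these code pages agree with ASCII on codes 0 to 127; only the upper half differs.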

Bytes:

The byte is a unit of binary data. A byte is usually 8 bits long, although some older computer architectures used different lengths. To avoid ambiguity, much international literature uses the word octet instead of byte for an 8-bit unit. Most computers use one byte to represent a letter, digit, or other character. A byte can also be treated simply as a sequence of binary bits. In some computer systems, 4 bytes make up a word, the unit of data the computer processes most efficiently when executing instructions. Some languages require 2 bytes to represent a character; such an encoding is called a double-byte character set. Some processors can handle both double-byte and single-byte instructions. Byte is usually abbreviated as an uppercase "B" and bit as a lowercase "b". Computer memory size is usually expressed in bytes.
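Putting the two ideas together, the size of an ASCII file in bytes equals the number of characters it contains, and the size in bits is eight times that. The following Python sketch writes a short sample string (chosen arbitrarily for this example) to a temporary file and measures it, assuming the usual 8-bit byte.

```python
import os
import tempfile

text = "Hello, ASCII"            # sample text, ASCII characters only
data = text.encode("ascii")      # one byte per character

# Write the bytes to a temporary file in binary mode, then close it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

size_bytes = os.path.getsize(path)   # file size as reported by the OS
size_bits = size_bytes * 8           # assuming 8 bits per byte
os.remove(path)                      # clean up the temporary file

print(len(text), size_bytes, size_bits)
```

Writing in binary mode matters here: in text mode some platforms translate line endings, which would change the byte count for text that contains newlines.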
