
What is the storage form of character data in memory?

青灯夜游
2021-06-03 11:57:27

Character data is stored in memory as its ASCII code. Assigning a character constant to a character variable does not place the character itself in the memory unit; it places the character's ASCII code in the storage unit.


Operating environment for this tutorial: Windows 7, C99, Dell G3 computer.

Character data is stored in memory as its ASCII code value, which occupies one byte. Like every other data type, it is ultimately held in memory as binary 0s and 1s; this principle does not change.

In C, assigning a character constant to a char variable does not place the character itself in the memory unit; it places the character's ASCII code in the storage unit. For example, 'A' is stored as the value 65 (binary 01000001), which is why char data can take part in arithmetic like a small integer.

In legacy double-byte Chinese encodings (GB 2312 or GBK, for example), one Chinese character takes 2 bytes. In UTF-8, an English letter takes 1 byte and a Chinese character takes 3 or 4 bytes (3 for the commonly used characters). In UTF-16, an English letter or a commonly used Chinese character takes 2 bytes; rarer characters outside the Basic Multilingual Plane take 4. In UTF-32, every character takes 4 bytes.


Extended information:

ASCII (American Standard Code for Information Interchange) is a character encoding based on the Latin alphabet, used primarily to represent modern English and other Western European languages. It is the most common information interchange standard and is equivalent to the international standard ISO/IEC 646. ASCII was first published as a standard in 1963, substantially revised in 1967, and last updated in 1986. It defines a total of 128 characters.

ASCII uses combinations of 7 or 8 binary digits to represent 128 or 256 possible characters. Standard ASCII, also called basic ASCII, uses 7 binary digits (the remaining high bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English [1]. Among them:

0~31 and 127 (33 in total) are control characters or communication control characters (the rest are printable characters). Control characters include LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell); communication control characters include SOH (start of heading), EOT (end of transmission), and ACK (acknowledge). ASCII values 8, 9, 10, and 13 correspond to backspace, tab, line feed, and carriage return respectively. They have no graphic representation of their own, but they affect how text is displayed, and the effect depends on the application [1].

32~126 (95 in total) are printable characters (32 is the space), of which 48~57 are the ten Arabic numerals 0 to 9.

65~90 are the 26 uppercase English letters, 97~122 are the 26 lowercase English letters, and the rest are punctuation marks, arithmetic symbols, and similar characters.

Also note that with standard 7-bit ASCII, the highest bit of the byte (b7) can be used as a parity bit. Parity checking is a method for detecting errors that occur during transmission, and it comes in two forms: odd parity and even parity. Under odd parity, the number of 1s in a correct byte must be odd; if the 7 data bits contain an even number of 1s, b7 is set to 1. Under even parity, the number of 1s in a correct byte must be even; if the 7 data bits contain an odd number of 1s, b7 is set to 1 [1].

Codes 128~255 are called extended ASCII. Many x86-based systems support extended (or "high") ASCII, which uses the 8th bit of each byte to define an additional 128 special symbols, foreign letters, and graphic symbols. Which character a given extended code represents depends on the code page in use.
