
How many bytes does one ASCII character occupy?

青灯夜游
2023-03-09 15:49:05

One ASCII character occupies 1 byte. In a computer, an ASCII character is encoded as a 7-bit or 8-bit binary value and stored in one byte; that is, one ASCII code occupies one byte. ASCII is divided into standard ASCII and extended ASCII. Standard ASCII, also called basic ASCII, uses 7 binary digits (the remaining bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English.


The operating environment of this tutorial: Windows 7 system, Dell G3 computer.

ASCII (American Standard Code for Information Interchange) is a character encoding system based on the Latin alphabet, mainly used to display modern English and other Western European languages.

ASCII uses a 7-bit or 8-bit binary number to represent 128 (2^7) or 256 (2^8) possible characters.

In a computer, an ASCII character is encoded as a 7-bit or 8-bit binary value and stored in one byte; that is, one ASCII code occupies one byte.
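The one-byte claim can be checked directly. The following is a minimal Python sketch; the sample string and variable names are only illustrative and are not part of the original article.

```python
# Encode a piece of ASCII text and compare character count with byte count.
text = "Hello, ASCII!"
encoded = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input

print(len(text))     # number of characters
print(len(encoded))  # number of bytes: one per character

# Every standard ASCII code fits in 7 bits, so the top bit of each byte is 0.
all_seven_bit = all(byte < 128 for byte in encoded)
print(all_seven_bit)
```

Both lengths are the same because each ASCII character is stored in exactly one byte.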


ASCII code can be divided into standard ASCII code and extended ASCII code.

Standard ASCII code is also called basic ASCII code. It uses 7 binary digits (the remaining bit is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English. Among them:

  • 0~31 and 127 (33 in total) are control characters or special communication characters (the rest are displayable characters).

    Control characters include, for example, LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell).

    Special communication characters include SOH (start of heading), EOT (end of transmission), ACK (acknowledge), etc.

    ASCII values 8, 9, 10 and 13 correspond to the backspace, tab, line feed and carriage return characters respectively. They have no graphic representation of their own, but they affect how text is displayed in ways that depend on the application.

  • 32~126 (95 in total) are printable characters (32 is the space), of which 48~57 are the ten Arabic numerals 0 to 9.

  • 65~90 are the 26 uppercase English letters, 97~122 are the 26 lowercase English letters, and the rest are punctuation marks, arithmetic symbols, etc.
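The ranges listed above can be expressed as a small classifier. This is only a sketch in Python; the function name `classify` and the category labels are made up for illustration.

```python
def classify(code: int) -> str:
    """Return the category of a standard ASCII code (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError("not a standard ASCII code")
    if code <= 31 or code == 127:
        return "control"
    if code == 32:
        return "space"
    if 48 <= code <= 57:
        return "digit"
    if 65 <= code <= 90:
        return "uppercase"
    if 97 <= code <= 122:
        return "lowercase"
    return "punctuation/symbol"


# Count how many codes fall into each category.
counts = {}
for code in range(128):
    category = classify(code)
    counts[category] = counts.get(category, 0) + 1

print(counts)
```

The counts agree with the list: 33 control characters, and 95 printable characters made up of 1 space, 10 digits, 52 letters, and 32 punctuation marks and symbols.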

Also note that in standard ASCII the highest bit (b7) can be used as a parity bit. Parity checking is a method for detecting errors that occur during code transmission, and comes in two types: odd parity and even parity. Odd parity: the number of 1s in a correctly transmitted byte must be odd; if the 7 data bits contain an even number of 1s, the highest bit b7 is set to 1. Even parity: the number of 1s in a correctly transmitted byte must be even; if the 7 data bits contain an odd number of 1s, the highest bit b7 is set to 1.
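The parity rules can be illustrated with a short Python sketch. The helper name `add_parity` and its `odd` parameter are hypothetical; the sketch only shows how bit b7 would be chosen, not how any particular hardware or protocol implements it.

```python
def add_parity(code: int, odd: bool = False) -> int:
    """Return an 8-bit value: the 7-bit ASCII code with a parity bit in b7."""
    if not 0 <= code <= 127:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")  # number of 1s in the 7 data bits
    # Even parity sets b7 when the count is odd;
    # odd parity sets b7 when the count is even.
    need_bit = (ones % 2 == 1) != odd
    return code | 0x80 if need_bit else code


# "A" is 100 0001 (two 1s); "C" is 100 0011 (three 1s).
for ch in "AC":
    even_value = add_parity(ord(ch))
    odd_value = add_parity(ord(ch), odd=True)
    print(ch, format(even_value, "08b"), format(odd_value, "08b"))
```

For "A" the bit is set only under odd parity, and for "C" only under even parity, so the total number of 1s in the byte always matches the chosen rule.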

The last 128 codes (128 to 255) are called extended ASCII codes. Many x86-based systems support extended (or "high") ASCII. Extended ASCII uses the 8th bit of each byte to define an additional 128 special symbols, foreign letters, and graphic symbols.
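A value above 127 still occupies one byte, but the character it stands for depends on which extended character set is in use. The Python sketch below assumes two common 8-bit sets, code page 437 (the original IBM PC set) and ISO 8859-1 (Latin-1), purely as examples; the article itself does not name a specific one.

```python
raw = bytes([0xE9])  # one byte with the 8th bit set (decimal 233)

print(len(raw))  # still a single byte

# The same byte maps to different characters in different 8-bit sets.
as_cp437 = raw.decode("cp437")     # Greek capital theta in code page 437
as_latin1 = raw.decode("latin-1")  # e with acute accent in ISO 8859-1
print(as_cp437, as_latin1)
```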

The standard ASCII table is as follows:

ASCII table (Bin = binary, Oct = octal, Dec = decimal, Hex = hexadecimal)

Bin        Oct   Dec  Hex   Abbreviation/Character       Explanation
0000 0000  00    0    0x00  NUL (null)                   Null character
0000 0001  01    1    0x01  SOH (start of heading)       Start of heading
0000 0010  02    2    0x02  STX (start of text)          Start of text
0000 0011  03    3    0x03  ETX (end of text)            End of text
0000 0100  04    4    0x04  EOT (end of transmission)    End of transmission
0000 0101  05    5    0x05  ENQ (enquiry)                Enquiry
0000 0110  06    6    0x06  ACK (acknowledge)            Acknowledge
0000 0111  07    7    0x07  BEL (bell)                   Bell
0000 1000  010   8    0x08  BS (backspace)               Backspace
0000 1001  011   9    0x09  HT (horizontal tab)          Horizontal tab
0000 1010  012   10   0x0A  LF (NL line feed, new line)  Line feed
0000 1011  013   11   0x0B  VT (vertical tab)            Vertical tab
0000 1100  014   12   0x0C  FF (NP form feed, new page)  Form feed
0000 1101  015   13   0x0D  CR (carriage return)         Carriage return
0000 1110  016   14   0x0E  SO (shift out)               Shift out
0000 1111  017   15   0x0F  SI (shift in)                Shift in
0001 0000  020   16   0x10  DLE (data link escape)       Data link escape
0001 0001  021   17   0x11  DC1 (device control 1)       Device control 1
0001 0010  022   18   0x12  DC2 (device control 2)       Device control 2
0001 0011  023   19   0x13  DC3 (device control 3)       Device control 3
0001 0100  024   20   0x14  DC4 (device control 4)       Device control 4
0001 0101  025   21   0x15  NAK (negative acknowledge)   Negative acknowledge
0001 0110  026   22   0x16  SYN (synchronous idle)       Synchronous idle
0001 0111  027   23   0x17  ETB (end of trans. block)    End of transmission block
0001 1000  030   24   0x18  CAN (cancel)                 Cancel
0001 1001  031   25   0x19  EM (end of medium)           End of medium
0001 1010  032   26   0x1A  SUB (substitute)             Substitute
0001 1011  033   27   0x1B  ESC (escape)                 Escape
0001 1100  034   28   0x1C  FS (file separator)          File separator
0001 1101  035   29   0x1D  GS (group separator)         Group separator
0001 1110  036   30   0x1E  RS (record separator)        Record separator
0001 1111  037   31   0x1F  US (unit separator)          Unit separator
0010 0000  040   32   0x20  (space)                      Space
0010 0001  041   33   0x21  !                            Exclamation mark
0010 0010  042   34   0x22  "                            Double quote
0010 0011  043   35   0x23  #                            Number sign
0010 0100  044   36   0x24  $                            Dollar sign
0010 0101  045   37   0x25  %                            Percent sign
0010 0110  046   38   0x26  &                            Ampersand
0010 0111  047   39   0x27  '                            Apostrophe (closing single quote)
0010 1000  050   40   0x28  (                            Opening parenthesis
0010 1001  051   41   0x29  )                            Closing parenthesis
0010 1010  052   42   0x2A  *                            Asterisk
0010 1011  053   43   0x2B  +                            Plus sign
0010 1100  054   44   0x2C  ,                            Comma
0010 1101  055   45   0x2D  -                            Minus sign/dash
0010 1110  056   46   0x2E  .                            Period
0010 1111  057   47   0x2F  /                            Slash
0011 0000  060   48   0x30  0                            Digit 0
0011 0001  061   49   0x31  1                            Digit 1
0011 0010  062   50   0x32  2                            Digit 2
0011 0011  063   51   0x33  3                            Digit 3
0011 0100  064   52   0x34  4                            Digit 4
0011 0101  065   53   0x35  5                            Digit 5
0011 0110  066   54   0x36  6                            Digit 6
0011 0111  067   55   0x37  7                            Digit 7
0011 1000  070   56   0x38  8                            Digit 8
0011 1001  071   57   0x39  9                            Digit 9
0011 1010  072   58   0x3A  :                            Colon
0011 1011  073   59   0x3B  ;                            Semicolon
0011 1100  074   60   0x3C  <                            Less-than sign
0011 1101  075   61   0x3D  =                            Equals sign
0011 1110  076   62   0x3E  >                            Greater-than sign
0011 1111  077   63   0x3F  ?                            Question mark
0100 0000  0100  64   0x40  @                            At sign
0100 0001  0101  65   0x41  A                            Uppercase letter A
0100 0010  0102  66   0x42  B                            Uppercase letter B
0100 0011  0103  67   0x43  C                            Uppercase letter C
0100 0100  0104  68   0x44  D                            Uppercase letter D
0100 0101  0105  69   0x45  E                            Uppercase letter E
0100 0110  0106  70   0x46  F                            Uppercase letter F
0100 0111  0107  71   0x47  G                            Uppercase letter G
0100 1000  0110  72   0x48  H                            Uppercase letter H
0100 1001  0111  73   0x49  I                            Uppercase letter I
0100 1010  0112  74   0x4A  J                            Uppercase letter J
0100 1011  0113  75   0x4B  K                            Uppercase letter K
0100 1100  0114  76   0x4C  L                            Uppercase letter L
0100 1101  0115  77   0x4D  M                            Uppercase letter M
0100 1110  0116  78   0x4E  N                            Uppercase letter N
0100 1111  0117  79   0x4F  O                            Uppercase letter O
0101 0000  0120  80   0x50  P                            Uppercase letter P
0101 0001  0121  81   0x51  Q                            Uppercase letter Q
0101 0010  0122  82   0x52  R                            Uppercase letter R
0101 0011  0123  83   0x53  S                            Uppercase letter S
0101 0100  0124  84   0x54  T                            Uppercase letter T
0101 0101  0125  85   0x55  U                            Uppercase letter U
0101 0110  0126  86   0x56  V                            Uppercase letter V
0101 0111  0127  87   0x57  W                            Uppercase letter W
0101 1000  0130  88   0x58  X                            Uppercase letter X
0101 1001  0131  89   0x59  Y                            Uppercase letter Y
0101 1010  0132  90   0x5A  Z                            Uppercase letter Z
0101 1011  0133  91   0x5B  [                            Opening square bracket
0101 1100  0134  92   0x5C  \                            Backslash
0101 1101  0135  93   0x5D  ]                            Closing square bracket
0101 1110  0136  94   0x5E  ^                            Caret
0101 1111  0137  95   0x5F  _                            Underscore
0110 0000  0140  96   0x60  `                            Grave accent (opening single quote)
0110 0001  0141  97   0x61  a                            Lowercase letter a
0110 0010  0142  98   0x62  b                            Lowercase letter b
0110 0011  0143  99   0x63  c                            Lowercase letter c
0110 0100  0144  100  0x64  d                            Lowercase letter d
0110 0101  0145  101  0x65  e                            Lowercase letter e
0110 0110  0146  102  0x66  f                            Lowercase letter f
0110 0111  0147  103  0x67  g                            Lowercase letter g
0110 1000  0150  104  0x68  h                            Lowercase letter h
0110 1001  0151  105  0x69  i                            Lowercase letter i
0110 1010  0152  106  0x6A  j                            Lowercase letter j
0110 1011  0153  107  0x6B  k                            Lowercase letter k
0110 1100  0154  108  0x6C  l                            Lowercase letter l
0110 1101  0155  109  0x6D  m                            Lowercase letter m
0110 1110  0156  110  0x6E  n                            Lowercase letter n
0110 1111  0157  111  0x6F  o                            Lowercase letter o
0111 0000  0160  112  0x70  p                            Lowercase letter p
0111 0001  0161  113  0x71  q                            Lowercase letter q
0111 0010  0162  114  0x72  r                            Lowercase letter r
0111 0011  0163  115  0x73  s                            Lowercase letter s
0111 0100  0164  116  0x74  t                            Lowercase letter t
0111 0101  0165  117  0x75  u                            Lowercase letter u
0111 0110  0166  118  0x76  v                            Lowercase letter v
0111 0111  0167  119  0x77  w                            Lowercase letter w
0111 1000  0170  120  0x78  x                            Lowercase letter x
0111 1001  0171  121  0x79  y                            Lowercase letter y
0111 1010  0172  122  0x7A  z                            Lowercase letter z
0111 1011  0173  123  0x7B  {                            Opening curly brace
0111 1100  0174  124  0x7C  |                            Vertical line
0111 1101  0175  125  0x7D  }                            Closing curly brace
0111 1110  0176  126  0x7E  ~                            Tilde
0111 1111  0177  127  0x7F  DEL (delete)                 Delete

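Each row of the table lists the same code in binary, octal, decimal, and hexadecimal. The numeric columns can be reproduced with Python's built-in `format` function; the helper name `table_row` below is only illustrative.

```python
def table_row(code: int) -> tuple:
    """Return (binary, octal, decimal, hexadecimal) for one ASCII code."""
    binary = format(code, "08b")                # 8 binary digits, zero padded
    binary = binary[:4] + " " + binary[4:]      # grouped as in the table
    octal = "0" + format(code, "o")             # octal with a leading 0
    hexadecimal = "0x" + format(code, "02X")    # two uppercase hex digits
    return binary, octal, code, hexadecimal


for ch in ("\n", "0", "A", "a"):
    print(repr(ch), table_row(ord(ch)))
```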
Common ASCII code size rules: digits < uppercase letters < lowercase letters

  • Digits are smaller than letters. For example, "7" < "F".
  • The digit 0 is smaller than the digit 9, and the codes increase in sequence from 0 to 9. For example, "3" < "8".
  • The letter A is smaller than the letter Z, and the codes increase in order from A to Z. For example, "A" < "Z".
  • The uppercase form of a letter is 32 smaller than its lowercase form. For example, "A" < "a".
  • The ASCII codes of several common characters: "A" is 65; "a" is 97; "0" is 48.
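These ordering rules can be checked with Python's built-in `ord` and `chr` functions. The sketch below uses arbitrarily chosen sample characters.

```python
digit, upper, lower = "7", "F", "f"

# Digits sort before uppercase letters, which sort before lowercase letters.
print(ord(digit), ord(upper), ord(lower))
ordering_holds = digit < upper < lower
print(ordering_holds)

# A lowercase letter is 32 greater than the same letter in uppercase.
case_offset = ord("a") - ord("A")
print(case_offset)

# Subtracting the offset converts a lowercase letter to uppercase.
to_upper = chr(ord("g") - case_offset)
print(to_upper)
```

Because the offset between cases is exactly 32 (a single bit), converting between uppercase and lowercase only requires adding or subtracting that value.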

