JavaScript Core Reading: Lexical Structure (Basic Knowledge)

WBOY
2016-05-16 16:16:24

The lexical structure of a programming language is the set of elementary rules that specifies how you write programs in that language. It is the lowest level of the syntax: it specifies what variable names look like, how comments are written, and how one statement is separated from the next. This short section introduces the lexical structure of JavaScript.

1. Character set

JavaScript programs are written using the Unicode character set, which is a superset of ASCII and Latin-1 and supports virtually every written language in use today. ECMAScript 3 requires JavaScript implementations to support Unicode 2.1 or later, and ECMAScript 5 requires support for Unicode 3 or later.

i. Case sensitive

JavaScript is a case-sensitive language, which means that keywords, variables, function names, and all other identifiers must always be typed with consistent capitalization. For example, the keyword while must be written as while, not While or WHILE.

Note, however, that HTML is not case-sensitive (although XHTML is). Because HTML is so closely associated with client-side JavaScript, the difference is easy to confuse. For example, the onclick event handler attribute can be written as onClick in HTML, but in JavaScript code it must be written in lowercase as onclick.
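As a small illustration of case sensitivity, the sketch below declares variables whose names differ only in capitalization. The variable names are made up for this example.

```javascript
// Three distinct variables: identifiers that differ only in case never collide.
var online = "lowercase";
var Online = "capitalized";
var ONLINE = "uppercase";

// Keywords follow the same rule: `While` is an ordinary identifier,
// not the keyword `while`.
var While = "just a variable";

console.log(online, Online, ONLINE, While);
```

Each name refers to its own value, so assigning to one of them leaves the others untouched.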

ii. Whitespace, line breaks, and format control characters

JavaScript ignores spaces that appear between tokens in a program, and in most cases it ignores line breaks as well. Because spaces and line breaks can be used freely, you can format and indent your code in a neat, consistent way that makes it easier to read.

In addition to the ordinary space character (\u0020), JavaScript treats the following characters as whitespace: tab (\u0009), vertical tab (\u000B), form feed (\u000C), nonbreaking space (\u00A0), byte order mark (\uFEFF), and any character in Unicode category Zs. JavaScript recognizes the following characters as line terminators: line feed (\u000A), carriage return (\u000D), line separator (\u2028), and paragraph separator (\u2029). A carriage return followed by a line feed is treated as a single line terminator.
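The whitespace and line terminator characters listed above can be tried out with a small sketch. It assumes an ES5-or-later engine in which `new Function` is available; `run` is a hypothetical helper introduced only for this example.

```javascript
// Hypothetical helper for this sketch: compile a source string and run it.
function run(source) {
  return new Function(source)();
}

// NBSP, vertical tab, form feed, tab and the byte order mark
// all separate tokens just like an ordinary space.
var viaOddSpaces = run("var\u00A0x\u000B=\u000C1;\treturn\uFEFFx;");

// LS, PS and CR+LF all end a line, so each `var` below starts a new statement.
var viaTerminators = run(
  "var a = 1\u2028var b = 2\u2029var c = 3\r\nreturn a + b + c"
);

console.log(viaOddSpaces, viaTerminators);
```

The escapes are written inside ordinary string literals, so the compiled source really contains the raw characters rather than the six-character escape sequences.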

Unicode format control characters (category Cf), such as the right-to-left mark (\u200F) and the left-to-right mark (\u200E), control the visual presentation of the text in which they occur. They are important for displaying some non-English languages correctly. These characters may appear in JavaScript comments, string literals, and regular expression literals, but not in identifiers (for example, variable names). The exceptions are the zero-width joiner (\u200D) and the zero-width non-joiner (\u200C), which may appear in an identifier but not as its first character. As mentioned above, the byte order mark format control character (\uFEFF) is treated as whitespace.

iii. Unicode escape sequences

Some computer hardware and software cannot display or input the full set of Unicode characters. To support programmers using such older technology, JavaScript defines special sequences of six ASCII characters that represent any 16-bit Unicode code unit. These Unicode escapes begin with the characters \u and are followed by exactly four hexadecimal digits (using digits and uppercase or lowercase letters A-F). Unicode escapes may appear in JavaScript string literals, regular expression literals, and identifiers (but not in language keywords). For example, the Unicode escape for the character é is \u00E9, and the following two JavaScript strings are identical.

"café" === "caf\u00e9" // => true
Unicode escapes may also appear in comments, but because JavaScript ignores comments, they are treated there as plain ASCII characters and are not interpreted as the corresponding Unicode characters.
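The following sketch exercises Unicode escapes in the three places where they are interpreted: a string literal, an identifier, and a regular expression literal. The variable names are illustrative only.

```javascript
// An escape inside a string literal stands for the character itself.
var escaped = "caf\u00e9";
console.log(escaped.length);                    // 4 characters, not 9
console.log(escaped.charCodeAt(3) === 0x00e9);  // true

// Escapes may also appear in identifiers: \u0061 is the letter "a",
// so this line declares the variable `abc`.
var \u0061bc = 5;
console.log(abc);                               // 5

// In a regular expression literal the escape matches the character as well.
console.log(/\u00e9$/.test(escaped));           // true
```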

iv. Normalization

Unicode allows more than one way of encoding the same character. The character é, for example, can be encoded as the single Unicode character \u00E9 or as a regular ASCII e followed by the combining acute accent \u0301. The two encodings look exactly the same in a text editor, but they have different binary representations, and the computer considers them different. The Unicode standard defines the preferred encoding for all characters and specifies a normalization procedure that converts text to a canonical form suitable for comparison. JavaScript assumes that the source code it is interpreting has already been normalized; it makes no attempt to normalize identifiers, strings, or regular expressions.
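The difference between the two encodings of é can be observed directly. Note that this sketch relies on `String.prototype.normalize`, which was added in ECMAScript 2015 and is therefore newer than the ECMAScript 3 and 5 editions discussed in this article.

```javascript
var precomposed = "\u00e9";  // é as a single code point
var combined = "e\u0301";    // e followed by a combining acute accent

// The engine compares code units, so the two spellings are different strings.
console.log(precomposed === combined);               // false
console.log(precomposed.length, combined.length);    // 1 2

// Normalizing both sides to the same form makes them comparable.
console.log(precomposed === combined.normalize("NFC"));  // true
console.log(precomposed.normalize("NFD") === combined);  // true
```

Normalization has to be requested explicitly; string comparison never applies it on its own.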

2. Comments

JavaScript supports two styles of comments. Any text between "//" and the end of the line is treated as a comment and ignored by JavaScript.
Any text between /* and */ is also treated as a comment. Comments of this kind may span multiple lines, but they cannot be nested.

// Single-line comment
/*
 * A comment that
 * spans several lines
 */

3. Literals

A literal is a data value that appears directly in a program. The following are all literals:

12               // A number
1.2              // A number with a decimal point
"Hello World"    // A string of text
'hi'             // Another string
true             // A Boolean value
false            // The other Boolean value
/javascript/gi   // A regular expression literal (for pattern matching)
null             // The absence of an object

Chapter 3 explains numeric and string literals in detail, and Chapter 10 covers regular expression literals. More complex expressions can be written as array or object literals:

{x:1, y:2}     // An object
[1,2,3,4,5]    // An array
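To see what kind of value each literal produces, the sketch below applies `typeof` to the literals listed above. The variable names `samples` and `kinds` are introduced only for this example.

```javascript
// The literals from the lists above, collected in an array literal.
var samples = [
  12, 1.2, "Hello World", 'hi', true, false,
  /javascript/gi, null, {x: 1, y: 2}, [1, 2, 3, 4, 5]
];

// typeof reports the primitive type; regular expressions, null,
// objects and arrays all report "object".
var kinds = samples.map(function (value) {
  return typeof value;
});

console.log(kinds.join(" "));
console.log(samples[6] instanceof RegExp);  // the regular expression literal
console.log(Array.isArray(samples[9]));     // the array literal
```

Because `typeof` cannot tell arrays, regular expressions and null apart, `instanceof` and `Array.isArray` are used for the finer distinction.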

4. Identifiers and reserved words

An identifier is simply a name. In JavaScript, identifiers are used to name variables and functions and to provide labels for certain loops in JavaScript code. A JavaScript identifier must begin with a letter, an underscore (_), or a dollar sign ($). Subsequent characters can be letters, digits, underscores, or dollar signs. (Digits are not allowed as the first character so that JavaScript can easily distinguish identifiers from numbers.) The following are all legal identifiers:

my_variable_name
b13
_dummy
$str

For portability and ease of editing, it is common to use only ASCII letters and digits in identifiers. Note, however, that JavaScript allows identifiers to contain letters and digits from the entire Unicode character set. (Technically, ECMAScript also allows characters from the Unicode categories Mn, Mc, and Pc to appear after the first character of an identifier.) This allows programmers to use variable names from non-English languages and to use mathematical symbols:

var sá = true;
var π = 3.14;
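The ASCII subset of the identifier rule described above can be approximated with a regular expression. This is only a sketch: `isAsciiIdentifier` is a hypothetical helper, it deliberately ignores Unicode letters such as π, and it does not reject reserved words.

```javascript
// Hypothetical helper: ASCII-only approximation of the identifier rule.
// First character: letter, underscore or dollar sign.
// Remaining characters: letters, digits, underscores or dollar signs.
function isAsciiIdentifier(name) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);
}

var names = ["my_variable_name", "b13", "_dummy", "$str", "13b", "my-name", ""];
names.forEach(function (name) {
  console.log(JSON.stringify(name), isAsciiIdentifier(name));
});
```

The four sample identifiers from the list above pass the check, while a name that starts with a digit or contains a hyphen does not.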

JavaScript reserves a number of identifiers as keywords of the language itself. You cannot use these keywords as identifiers in your programs.

break
case
catch
continue
default
delete
do
else
finally
for
function
if
in
instanceof
new
return
switch
this
throw
try
typeof
var
void
while
with

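JavaScript reserved words

ECMAScript 5 also reserves the following words, which the language did not use at the time but which future versions might:

class const enum export extends import super

In addition, the following words are legal identifiers in ordinary JavaScript code but are reserved in strict mode: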

implements let private public yield interface package protected static

Strict mode also restricts the use of the following identifiers. They are not fully reserved, but they cannot be used as variable, parameter, or function names:

arguments eval
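The difference between ordinary and strict code can be checked by compiling small snippets. In this sketch, `compiles` is a hypothetical helper that relies on `new Function`, whose body is non-strict unless it starts with a "use strict" directive; environments that forbid dynamic code evaluation cannot run it.

```javascript
// Hypothetical helper: report whether a snippet parses without a SyntaxError.
// The function is only compiled here, never called.
function compiles(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Reserved only in strict mode.
console.log(compiles("var implements = 1;"));                 // non-strict
console.log(compiles('"use strict"; var implements = 1;'));   // strict

// Restricted identifiers: not reserved, but not usable as names in strict mode.
console.log(compiles("var eval = 1;"));                       // non-strict
console.log(compiles('"use strict"; var eval = 1;'));         // strict
console.log(compiles('"use strict"; function f(arguments) {}'));
```

The non-strict snippets compile, while each strict-mode snippet is rejected with a SyntaxError before any code runs.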
Individual JavaScript implementations may define their own global variables and functions. Each JavaScript host environment (client side, server side, and so on) has its own list of global properties, which you need to keep in mind when choosing names. (See the Window object for the list of global variables and functions defined by client-side JavaScript.)

5. Optional semicolon

Like many programming languages, JavaScript uses the semicolon (;) to separate statements. This is important for making the meaning of the code clear: without a separator, the end of one statement might appear to be the beginning of the next, or vice versa.
In JavaScript, you can usually omit the semicolon between two statements if they are written on separate lines. (You can also omit a semicolon at the end of a program or when the next token is a closing curly brace "}".) Many JavaScript programmers (and the code examples in this book) use semicolons to mark the end of every statement explicitly, even where they are not required. Another style is to omit semicolons wherever possible and use them only in the few situations that require them. Whichever style you choose, there are a few details about optional semicolons in JavaScript that you should understand.
In the following code, the first semicolon can be omitted:

a=3;
b=4;
Written on a single line as follows, however, the first semicolon is required:

a=3;b=4;
Note that JavaScript does not treat every line break as a semicolon: it usually does so only when it cannot parse the code without one. More precisely (and with the two exceptions described below), JavaScript treats a line break as a semicolon if the next non-space character cannot be interpreted as a continuation of the current statement. Consider the following code:

var a
a
=
3
console.log(a)
JavaScript interprets this code as:

var a;a=3;console.log(a);
JavaScript inserts a semicolon after the first line because it cannot parse the code var a a without one. The second a could stand alone as the statement "a;", but JavaScript does not insert a semicolon at the end of the second line, because it can continue parsing with the following lines to form the longer statement "a=3;".
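The line-break example can be run as written by passing it to `new Function`. In this sketch the final `console.log(a)` is replaced by `return a` so that the result can be inspected; that substitution is an assumption made for illustration.

```javascript
// The statements from the example, one token group per line.
var source = [
  "var a",
  "a",
  "=",
  "3",
  "return a"
].join("\n");

// The parser inserts a semicolon after "var a" and after "3",
// but joins "a", "=" and "3" into the single statement a = 3.
var result = new Function(source)();
console.log(result);  // 3
```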

These statement termination rules lead to some surprising cases. The following code is written on two lines and looks like two separate statements:

var y = x + f
(a+b).toString()
But the parentheses on the second line can be interpreted as a call to the function f from the first line, so JavaScript interprets the code like this:

var y = x + f(a+b).toString();
This is almost certainly not what the author intended. To make the two lines work as separate statements, an explicit semicolon is required at the end of the first line.
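The pitfall can be reproduced with concrete values. In this sketch the function `f` and the values of `x`, `a` and `b` are made up, and `f` records its arguments so that the unintended call becomes visible.

```javascript
var calls = [];

// Hypothetical function that records every argument it is called with.
function f(arg) {
  calls.push(arg);
  return 10;
}

var x = 1, a = 2, b = 3;

// No semicolon after `f`, so the next line continues this statement:
// it is parsed as  var y = x + f(a + b).toString();
var y = x + f
(a + b).toString()

console.log(calls);  // f was called with 5
console.log(y);      // the string "110", that is 1 + "10"
```

With a semicolon after `f` on the first line, `y` would instead hold the result of adding the number 1 to the function itself, and `f` would never be called.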

In general, if a statement begins with (, [, /, +, or -, there is a chance that it will be interpreted as a continuation of the previous statement. Statements beginning with /, +, and - are quite rare in practice, but statements beginning with ( and [ are common, at least in some styles of JavaScript programming. Some programmers like to put a defensive semicolon at the beginning of any such statement, so that it continues to be parsed correctly even if the statement before it is later modified and a previously terminating semicolon is removed.
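A sketch of the defensive semicolon in a semicolon-free style follows. The variable names are illustrative, and the failing variant is compiled through `new Function` so that its error can be caught.

```javascript
var total = 0
// The leading semicolon keeps the array literal from being read as `0[1, 2, 3]`.
;[1, 2, 3].forEach(function (n) { total += n })
console.log(total)  // 6

// Without the defensive semicolon the two lines merge into one statement,
// `0[1, 2, 3]` evaluates to undefined, and calling forEach on it fails.
var failedWithoutSemicolon = false
try {
  new Function("var total = 0\n[1, 2, 3].forEach(function (n) { total += n })")()
} catch (e) {
  failedWithoutSemicolon = e instanceof TypeError
}
console.log(failedWithoutSemicolon)
```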
If the current statement and the following line cannot be parsed together, JavaScript inserts a semicolon after the first line. That is the general rule, and it has two exceptions. The first exception involves the return, break, and continue statements. If a line break follows any of these three keywords, JavaScript always treats that line break as a semicolon. For example

return
true;
JavaScript parses this as

return; true;
whereas the intended meaning was

return true;
In other words, there must be no line break between return, break, or continue and the expression that follows the keyword. If you do insert a line break, the program usually does not report an error; it simply fails in a non-obvious way that is hard to debug.
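The effect of a line break after return is easy to observe. The two function names below are made up for this sketch.

```javascript
// A line break right after `return` ends the statement,
// so the function returns undefined and `true;` is never reached.
function brokenReturn() {
  return
  true;
}

// Keeping the expression on the same line returns it as intended.
function fixedReturn() {
  return true;
}

console.log(brokenReturn());  // undefined
console.log(fixedReturn());   // true
```

No error is raised in either case, which is what makes this mistake hard to spot.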

The second exception involves the ++ and -- operators. These operators can be prefix operators that appear before an expression or postfix operators that appear after one. If you want to use either of them as a postfix operator, it must appear on the same line as the expression it applies to. Otherwise the line break is treated as a semicolon, and the ++ or -- is parsed as a prefix operator applied to the code that follows. Consider this code:

x
++
y

It is parsed as

x; ++y;

and not as x++; y.
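The parse of the increment operator across line breaks can be confirmed by checking which variable changes. The starting values are arbitrary.

```javascript
var x = 1
var y = 1

// The line break after `x` ends that statement, so the operator on the
// next line becomes a prefix increment of `y`.
x
++
y

console.log(x, y)  // 1 2
```

Had the operator been applied as a postfix increment to `x`, the output would show x as 2 and y as 1 instead.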