
What are the knowledge points about java class files?

王林
2023-05-05 12:22:06

The Class file format uses a pseudo-structure similar to the C language structure to store data. There are only two data types in this pseudo-structure: "unsigned number" and "table".

·Unsigned numbers are the basic data type. u1, u2, u4, and u8 represent unsigned numbers of 1 byte, 2 bytes, 4 bytes, and 8 bytes respectively. Unsigned numbers can be used to describe numbers, index references, quantities, or string values encoded in UTF-8.

·A table is a composite data type made up of multiple unsigned numbers or other tables as data items. To make them easy to tell apart, the names of all tables conventionally end with "_info". Tables are used to describe data with a hierarchical composite structure. The entire Class file can essentially be regarded as one table.


The first 4 bytes of each Class file are called the Magic Number. Its only function is to determine whether the file is a Class file that the virtual machine can accept. This is not unique to Class files: many file format standards use magic numbers for identification. For example, image formats such as GIF and JPEG have magic numbers in the file header.

The magic number of the Class file is very "romantic", and the value is 0xCAFEBABE (coffee baby?)

The 4 bytes that follow the magic number store the version number of the Class file: the 5th and 6th bytes are the minor version number (Minor Version), and the 7th and 8th bytes are the major version number (Major Version). Java version numbers start from 45. Each major JDK release after JDK 1.1 increases the major version number by 1 (JDK 1.0 to 1.1 used version numbers 45.0 to 45.3). A higher-version JDK can run Class files produced for earlier versions, but a JDK cannot run Class files of a version later than its own.
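The header layout described above (u4 magic, u2 minor version, u2 major version) can be read with a few lines of Java. This is only a minimal sketch: the class name ClassFileHeader and its parse method are made up for illustration, and the sample bytes in main are hand-written rather than taken from a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the JDK: reads the first 8 bytes of a class file.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();                 // u4, big-endian
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort();       // u2, bytes 5 and 6
            int major = in.readUnsignedShort();       // u2, bytes 7 and 8
            return new ClassFileHeader(magic, minor, major);
        } catch (IOException e) {
            throw new UncheckedIOException(e);        // includes truncated input
        }
    }

    public static void main(String[] args) {
        // Hand-written header: magic, minor version 0, major version 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = parse(header);
        System.out.printf("magic=0x%08X minor=%d major=%d%n",
                h.magic, h.minorVersion, h.majorVersion);
    }
}
```

To inspect a real file, the same parse method could be fed the bytes returned by java.nio.file.Files.readAllBytes for a .class path.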

Constant Pool

Immediately after the major and minor version numbers comes the constant pool. The constant pool can be thought of as the resource warehouse of the Class file. It is the data most closely connected with the other items in the Class file structure, and it is usually one of the items that takes up the most space. It is also the first table-type data item to appear in the Class file.

There are two main kinds of constants stored in the constant pool: literals and symbolic references. Literals are close to the concept of constants at the Java language level, such as text strings and constant values declared as final. Symbolic references are a concept from compiler theory and mainly include the following kinds of constants:

·Packages exported or opened by a module

·Fully qualified names (Fully Qualified Name) of classes and interfaces

·Field names and descriptors (Descriptor)

·Method names and descriptors

·Method handles and method types (Method Handle, Method Type, Invoke Dynamic)

·Dynamically computed call sites and dynamically computed constants (Dynamically-Computed Call Site, Dynamically-Computed Constant)

When Java code is compiled with Javac, there is no "linking" step as there is in C and C++; instead, linking happens dynamically when the virtual machine loads the Class file. In other words, the Class file does not store the final memory layout of methods and fields. Until the virtual machine converts the symbolic references of these fields and methods at runtime, their real memory entry addresses are unknown, so they cannot be used directly by the virtual machine. When the virtual machine loads a class, it obtains the corresponding symbolic references from the constant pool and then resolves them to concrete memory addresses, either when the class is created or at runtime.

Each constant in the constant pool is a table. Initially there were 11 kinds of constant tables, each with a different structure. Later, 4 constants related to dynamic languages were added to better support dynamic language calls, and 2 more constants, CONSTANT_Module_info and CONSTANT_Package_info, were added to support the Java module system (Jigsaw). As a result, as of JDK 13 there are 17 different kinds of constants in the constant pool.


Incidentally, because methods, fields, and so on in the Class file all refer to CONSTANT_Utf8_info constants for their names, the maximum length of a CONSTANT_Utf8_info constant is also the maximum length of a method or field name in Java. That maximum is the largest value of the length item, which is 65535, the largest value a u2 can express. Therefore, if a Java program defines a variable or method name longer than 64 KB of English characters, it will not compile, even if the naming rules are followed and every character is legal.
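The 65535-byte ceiling can be observed without writing a class file at all. The sketch below relies on DataOutputStream.writeUTF, which also writes a two-byte length followed by modified UTF-8 data and therefore hits the same u2 limit. It is an analogy chosen for illustration, not the code path Javac uses, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class: probes the u2 length limit via DataOutputStream.writeUTF.
final class Utf8LengthLimit {

    /** Returns true if the string fits behind a two-byte (u2) length prefix. */
    static boolean fitsInU2Length(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);          // u2 length + modified UTF-8 bytes
            return true;
        } catch (UTFDataFormatException tooLong) {
            return false;             // encoded form is longer than 65535 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a string of n ASCII letters (one encoded byte each). */
    static String ascii(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fitsInU2Length(ascii(65535)));  // true
        System.out.println(fitsInU2Length(ascii(65536)));  // false
    }
}
```

Note that the limit counts encoded bytes, so a name made of non-ASCII characters would reach it with fewer characters.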

The following javap -verbose output for a sample class shows its constant pool:

Classfile /D:/BaiduYunDownload/geekbang-lessons/thinking-in-spring/validation/target/classes/org/geekbang/thinking/in/spring/validation/TestClass.class

Last modified 2020-6-25; size 439 bytes

MD5 checksum 18760ee8065f9fb68d4dab7bd7450c4c

Compiled from "TestClass.java"

public class org.geekbang.thinking.in.spring.validation.TestClass

minor version: 0

major version: 52

flags: ACC_PUBLIC, ACC_SUPER

Constant pool:

   #1 = Methodref          #4.#18         // java/lang/Object."<init>":()V

   #2 = Fieldref           #3.#19         // org/geekbang/thinking/in/spring/validation/TestClass.m:I

   #3 = Class              #20            // org/geekbang/thinking/in/spring/validation/TestClass

   #4 = Class              #21            // java/lang/Object

   #5 = Utf8               m

   #6 = Utf8               I

   #7 = Utf8               <init>

   #8 = Utf8               ()V

   #9 = Utf8               Code

  #10 = Utf8               LineNumberTable

  #11 = Utf8               LocalVariableTable

  #12 = Utf8               this

  #13 = Utf8               Lorg/geekbang/thinking/in/spring/validation/TestClass;

  #14 = Utf8               inc

  #15 = Utf8               ()I

  #16 = Utf8               SourceFile

  #17 = Utf8               TestClass.java

  #18 = NameAndType        #7:#8          // "<init>":()V

  #19 = NameAndType        #5:#6          // m:I

  #20 = Utf8               org/geekbang/thinking/in/spring/validation/TestClass

  #21 = Utf8               java/lang/Object

{

  public org.geekbang.thinking.in.spring.validation.TestClass();

    descriptor: ()V

    flags: ACC_PUBLIC

    Code:

      stack=1, locals=1, args_size=1

         0: aload_0

         1: invokespecial #1                  // Method java/lang/Object."<init>":()V

         4: return

      LineNumberTable:

        line 3: 0

      LocalVariableTable:

        Start  Length  Slot  Name   Signature

            0       5     0  this   Lorg/geekbang/thinking/in/spring/validation/TestClass;

  public int inc();

    descriptor: ()I

    flags: ACC_PUBLIC

    Code:

      stack=2, locals=1, args_size=1

         0: aload_0

         1: getfield

      LineNumberTable:

        line 7: 0

      LocalVariableTable:

        Start  Length  Slot  Name   Signature

            0       7     0  this   Lorg/geekbang/thinking/in/spring/validation/TestClass;

}

SourceFile: "TestClass.java"

After the constant pool ends, the next 2 bytes are the access flags (access_flags). These flags identify access information at the class or interface level, including: whether this Class is a class or an interface; whether it is defined as public; whether it is defined as abstract; and, if it is a class, whether it is declared final.
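Because access_flags is a bit mask, decoding it is a matter of testing individual bits. The sketch below covers only the flags named in the text plus ACC_SUPER; the mask values are the ones defined by the JVM specification, while the class name and decode method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for a subset of the class-level access flags.
final class ClassAccessFlags {
    static final int ACC_PUBLIC    = 0x0001;
    static final int ACC_FINAL     = 0x0010;
    static final int ACC_SUPER     = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT  = 0x0400;

    /** Returns the names of the known flags that are set, in ascending bit order. */
    static List<String> decode(int accessFlags) {
        List<String> names = new ArrayList<>();
        if ((accessFlags & ACC_PUBLIC) != 0)    names.add("ACC_PUBLIC");
        if ((accessFlags & ACC_FINAL) != 0)     names.add("ACC_FINAL");
        if ((accessFlags & ACC_SUPER) != 0)     names.add("ACC_SUPER");
        if ((accessFlags & ACC_INTERFACE) != 0) names.add("ACC_INTERFACE");
        if ((accessFlags & ACC_ABSTRACT) != 0)  names.add("ACC_ABSTRACT");
        return names;
    }

    public static void main(String[] args) {
        // 0x0021 = ACC_PUBLIC | ACC_SUPER, the combination reported for TestClass.
        System.out.println(decode(0x0021));   // [ACC_PUBLIC, ACC_SUPER]
    }
}
```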

The class index, parent class index, and interface index set follow the access flags in that order. The class index and parent class index are each a u2 index value pointing to a class descriptor constant of type CONSTANT_Class_info. The index value inside the CONSTANT_Class_info constant then leads to the fully qualified name string stored in a constant of type CONSTANT_Utf8_info.

It was not until Lambda expressions and interface default methods appeared in JDK 8 that the invokedynamic instruction found a place in Class files generated from the Java language.

JDK 8 therefore added the MethodParameters attribute, which lets the compiler (when the -parameters option is given at compile time) write method parameter names into the Class file. MethodParameters is an attribute of the method table, at the same level as the Code attribute, and it can be read through the reflection API at runtime.
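A short sketch of reading parameter names back through reflection follows. The add method and the class name are hypothetical. Whether real names or synthesized ones (arg0, arg1) are reported depends on whether the class was compiled with -parameters, so the sketch prints which case applies instead of assuming one.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Hypothetical demo class: inspects its own method's parameter names.
final class ParameterNames {

    // Sample method whose parameter names we try to read back.
    static int add(int left, int right) {
        return left + right;
    }

    /** One line per parameter, noting whether the name came from the class file. */
    static String describe() {
        try {
            Method m = ParameterNames.class.getDeclaredMethod("add", int.class, int.class);
            StringBuilder sb = new StringBuilder();
            for (Parameter p : m.getParameters()) {
                sb.append(p.getName())
                  .append(p.isNamePresent() ? " (from MethodParameters)" : " (synthesized)")
                  .append('\n');
            }
            return sb.toString();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Compiling once with and once without -parameters and comparing the two outputs shows the effect of the attribute.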

·Load a local variable onto the operand stack: iload

·Store a value from the operand stack into the local variable table: istore

·Load a constant onto the operand stack: bipush

iload_<n> represents the group of instructions iload_0, iload_1, iload_2 and iload_3.

·Addition instructions: iadd, ladd, fadd, dadd

·Subtraction instructions: isub, lsub, fsub, dsub

·Multiplication instructions: imul, lmul, fmul, dmul

·Division instructions: idiv, ldiv, fdiv, ddiv

·Remainder instructions: irem, lrem, frem, drem

·Negation instructions: ineg, lneg, fneg, dneg

·Shift instructions: ishl, ishr, iushr, lshl, lshr, lushr

·Bitwise OR instructions: ior, lor

·Bitwise AND instructions: iand, land

·Bitwise XOR instructions: ixor, lxor

·Local variable auto-increment instruction: iinc

·Comparison instructions: dcmpg, dcmpl, fcmpg, fcmpl, lcmp
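To connect the instruction names above with source code, the sketch below pairs a few Java operators with the int instruction Javac would typically emit for them. The mapping in the comments is what one would expect to see with javap -c, not output captured from a specific compiler, and the class name is hypothetical.

```java
// Hypothetical demo class; operands are parameters so the compiler cannot fold them away.
final class ArithmeticBytecodes {
    static int shiftRight(int x, int n)         { return x >> n; }    // ishr
    static int unsignedShiftRight(int x, int n) { return x >>> n; }   // iushr
    static int remainder(int a, int b)          { return a % b; }     // irem
    static int negate(int a)                    { return -a; }        // ineg
    static int xor(int a, int b)                { return a ^ b; }     // ixor

    public static void main(String[] args) {
        System.out.println(shiftRight(-8, 1));          // -4 (sign bit is kept)
        System.out.println(unsignedShiftRight(-8, 1));  // 2147483644 (zero is shifted in)
        System.out.println(remainder(-7, 3));           // -1 (sign follows the dividend)
        System.out.println(negate(5));                  // -5
        System.out.println(xor(6, 3));                  // 5
    }
}
```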

The semantics of the invokespecial instruction were changed in JDK 1.0.2. JDK 7 added the invokedynamic instruction and prohibited the ret and jsr instructions.

Class life cycle

Loading -> Linking (verification, preparation, resolution) -> Initialization -> Use -> Unloading.

The order of the five stages of loading, verification, preparation, initialization, and unloading is fixed, and the loading process of a type must begin step by step in this order. The resolution stage is the exception: in some cases it can start after the initialization stage, in order to support the runtime binding feature of the Java language (also known as dynamic binding or late binding).

public static final int value = 123;

At compile time, Javac generates a ConstantValue attribute for value. During the preparation stage, the virtual machine assigns 123 to value according to the ConstantValue setting.

The parent delegation model works as follows: when a class loader receives a class loading request, it does not first try to load the class itself. Instead, it delegates the request to its parent class loader. This holds at every level of class loader, so all loading requests eventually reach the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request (the required class is not found within its search scope) does the child loader try to load the class itself.
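The delegation order can be written out as a small custom class loader. The sketch below is a simplified rendering of the parent-first logic and assumes a non-null parent. ParentFirstLoader, askedLocally, and loadOrNull are hypothetical names, and findClass here only records the request instead of loading anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical loader that spells out the parent-first order and logs local lookups.
final class ParentFirstLoader extends ClassLoader {
    final List<String> askedLocally = new ArrayList<>();

    ParentFirstLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);        // 1. already loaded by this loader?
            if (c == null) {
                try {
                    c = getParent().loadClass(name);   // 2. delegate to the parent first
                } catch (ClassNotFoundException parentFailed) {
                    c = findClass(name);               // 3. only then try this loader
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        askedLocally.add(name);                        // record that the parent gave up
        throw new ClassNotFoundException(name);        // this sketch has nothing to load
    }

    /** Convenience wrapper that turns ClassNotFoundException into null. */
    static Class<?> loadOrNull(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ParentFirstLoader loader = new ParentFirstLoader(ClassLoader.getSystemClassLoader());
        System.out.println(loadOrNull(loader, "java.lang.String") == String.class); // true
        System.out.println(loadOrNull(loader, "no.such.Type"));                     // null
        System.out.println(loader.askedLocally);                                    // [no.such.Type]
    }
}
```

A request for java.lang.String never reaches findClass because an ancestor satisfies it; only the unknown name falls through to the child loader.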

In JDK 9, the extension class loader (Extension Class Loader) was replaced by the platform class loader (Platform Class Loader). This is a logical change. Since the entire JDK is now built on modules (the original rt.jar and tools.jar were split into dozens of JMOD files), the Java class library already satisfies the need for extensibility, and there is no longer any need to keep the \lib\ext directory. The earlier mechanism of extending JDK functionality through this directory or the java.ext.dirs system property has no further value, and the extension class loader that loaded those libraries has completed its historical mission.
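On JDK 9 or later, the loader hierarchy can be inspected by walking the parent chain from the system (application) class loader. The sketch below assumes a default launcher configuration. LoaderChain and describe are hypothetical names, and the concrete class names printed vary between JDK builds, so no output is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; requires JDK 9+ for ClassLoader.getPlatformClassLoader().
final class LoaderChain {

    /** Lists the loaders from start up to the bootstrap loader. */
    static List<String> describe(ClassLoader start) {
        List<String> names = new ArrayList<>();
        ClassLoader platform = ClassLoader.getPlatformClassLoader();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl == platform ? "platform" : cl.getClass().getName());
        }
        // The bootstrap loader has no Java object; getParent() reports it as null.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        System.out.println(describe(ClassLoader.getSystemClassLoader()));
        // Core classes such as String are defined by the bootstrap loader.
        System.out.println(String.class.getClassLoader() == null);
    }
}
```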


The Java virtual machine supports the following 5 method invocation bytecode instructions:

·invokestatic. Used to call static methods.

·invokespecial. Used to call the instance constructor <init>() method, private methods, and methods in the parent class.

·invokevirtual. Used to call all virtual methods.

·invokeinterface. Used to call interface methods; the object that implements the interface is determined at runtime.

·invokedynamic. The method referenced by the call site specifier is first resolved dynamically at runtime, and then that method is executed. The dispatch logic of the first four instructions is fixed inside the Java virtual machine, while the dispatch logic of invokedynamic is determined by a bootstrap method set by the user.
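The five instructions can be related to ordinary Java source as in the sketch below. The instruction named in each comment is what Javac would typically emit for that expression and can be checked with javap -c. All class and method names are hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical demo class: one call site per invocation instruction.
final class InvokeKinds {
    interface Greeter {
        String greet();
    }

    static class Base implements Greeter {
        @Override
        public String greet() {
            return "base";
        }
    }

    static class Derived extends Base {
        @Override
        public String greet() {
            return super.greet() + "+derived";    // parent class method: invokespecial
        }
    }

    static String helper() {
        return "static";
    }

    static String run() {
        Derived d = new Derived();                // instance constructor <init>: invokespecial
        Greeter g = d;
        Supplier<String> lazy = () -> "lambda";   // lambda creation: invokedynamic
        String a = helper();                      // invokestatic
        String b = d.greet();                     // invokevirtual (receiver typed as a class)
        String c = g.greet();                     // invokeinterface (receiver typed as an interface)
        String e = lazy.get();                    // invokeinterface on Supplier
        return String.join(",", a, b, c, e);
    }

    public static void main(String[] args) {
        System.out.println(run());   // static,base+derived,base+derived,lambda
    }
}
```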

Any method that can be called by the invokestatic or invokespecial instruction has a unique calling version that can be determined in the resolution stage. In the Java language, the methods that meet this condition are static methods, private methods, instance constructors, and parent class methods, 4 kinds in total. Adding methods modified by final (even though they are called with the invokevirtual instruction), these 5 kinds of method calls have their symbolic references resolved to direct references to the method when the class is loaded. These methods are collectively called "non-virtual methods" (Non-Virtual Method); all other methods are called "virtual methods" (Virtual Method).

A resolution call is always a static process: the target is fully determined at compile time, and during the resolution stage of class loading all the symbolic references involved are converted into definite direct references, with nothing deferred to runtime. The other main form of method call, dispatch (Dispatch), is much more complicated. It may be static or dynamic, and, according to the number of quantities on which the dispatch is based (the method receiver and the method arguments), it can be single dispatch or multi-dispatch. Combining these two classifications gives four kinds of dispatch: static single dispatch, static multi-dispatch, dynamic single dispatch, and dynamic multi-dispatch. Let's take a look at how method dispatch is performed in the virtual machine.

Consider code that deliberately defines two variables with the same static type but different actual types. When choosing among overloads, the virtual machine (or, to be precise, the compiler) uses the static type of the arguments, not their actual type, as the basis for its decision. Since the static type is known at compile time, the Javac compiler decides during compilation which overloaded version to use based on the static type of the argument. It therefore selects sayHello(Human) as the call target and writes the symbolic reference of this method into the operands of the two invokevirtual instructions in the main() method.

All dispatch actions that rely on static types to determine the execution version of a method are called static dispatch. The most typical application of static dispatch is method overloading. Static dispatch occurs during the compilation phase, so the action of determining static dispatch is not actually performed by the virtual machine. This is why some materials choose to classify it as "resolution" rather than "dispatch".
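The overload selection described above can be reproduced with a small sketch built around the sayHello(Human) call mentioned in the text. The subclasses Man and Woman and the returned strings are assumptions made for illustration.

```java
// Sketch of static dispatch: the overload is chosen from the static type of the argument.
final class StaticDispatch {
    static class Human {}
    static class Man extends Human {}
    static class Woman extends Human {}

    String sayHello(Human guy) { return "hello, guy!"; }
    String sayHello(Man guy)   { return "hello, gentleman!"; }
    String sayHello(Woman guy) { return "hello, lady!"; }

    public static void main(String[] args) {
        Human man = new Man();       // static type Human, actual type Man
        Human woman = new Woman();   // static type Human, actual type Woman
        StaticDispatch sd = new StaticDispatch();
        System.out.println(sd.sayHello(man));        // hello, guy!
        System.out.println(sd.sayHello(woman));      // hello, guy!
        // A cast changes the static type, and with it the selected overload.
        System.out.println(sd.sayHello((Man) man));  // hello, gentleman!
    }
}
```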

Among overload candidates, variable-length parameters have the lowest priority. Fields never participate in polymorphism: when a method of a class accesses a field by name, the name refers to the field that the class itself can see.
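Both statements can be checked with a few lines. In the sketch below, all names are hypothetical: a char argument prefers an overload reached through boxing over a variable-length overload, and a field read is bound by the static type of the reference, not by the actual object.

```java
// Hypothetical demo class for overload priority and field hiding.
final class OverloadAndFields {
    static String say(Object arg)    { return "Object"; }
    static String say(char... args)  { return "char..."; }

    static class Father {
        int money = 1;
        int fatherView() { return money; }   // always reads Father.money
    }

    static class Son extends Father {
        int money = 3;                       // hides Father.money, does not override it
        int sonView() { return money; }      // always reads Son.money
    }

    public static void main(String[] args) {
        System.out.println(say('a'));            // Object (boxing beats varargs)

        Father f = new Son();
        System.out.println(f.money);             // 1 (static type Father)
        System.out.println(((Son) f).money);     // 3 (static type Son)
        System.out.println(f.fatherView());      // 1
    }
}
```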

Key points

Precisely because the first step in executing the invokevirtual instruction is to determine the actual type of the receiver at runtime, the invokevirtual instructions in the two calls do not stop at resolving the symbolic reference of the method in the constant pool to a direct reference; they also select the method version according to the actual type of the receiver. This process is the essence of method overriding in the Java language. Dispatch that determines the method version at runtime according to the actual type is called dynamic dispatch. The root of polymorphism lies in the execution logic of the virtual method call instruction invokevirtual, so this conclusion holds only for methods and not for fields, because fields do not use this instruction.
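A minimal sketch of dynamic dispatch follows. Both calls are made through the same static type, yet the method that runs is chosen from the actual type of the receiver. The class names and strings are illustrative assumptions.

```java
// Sketch of dynamic dispatch through overriding.
final class DynamicDispatch {
    abstract static class Human {
        abstract String sayHello();
    }

    static class Man extends Human {
        @Override
        String sayHello() { return "man say hello"; }
    }

    static class Woman extends Human {
        @Override
        String sayHello() { return "woman say hello"; }
    }

    public static void main(String[] args) {
        Human man = new Man();
        Human woman = new Woman();
        System.out.println(man.sayHello());     // man say hello
        System.out.println(woman.sayHello());   // woman say hello
        man = new Woman();                      // same variable, new actual type
        System.out.println(man.sayHello());     // woman say hello
    }
}
```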

The Java language is a static multi-dispatch and dynamic single-dispatch language.
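The sketch below illustrates that statement with hypothetical classes. At compile time the overload is picked from two quantities, the static type of the receiver and the static type of the argument. At runtime only one quantity, the actual type of the receiver, still influences which method runs.

```java
// Hypothetical demo of static multi-dispatch and dynamic single dispatch.
final class MultiDispatch {
    static class Tea {}
    static class Coffee {}

    static class Father {
        String choose(Tea arg)    { return "father chooses tea"; }
        String choose(Coffee arg) { return "father chooses coffee"; }
    }

    static class Son extends Father {
        @Override
        String choose(Tea arg)    { return "son chooses tea"; }
        @Override
        String choose(Coffee arg) { return "son chooses coffee"; }
    }

    public static void main(String[] args) {
        Father father = new Father();
        Father son = new Son();
        // Compile time: choose(Coffee) and choose(Tea) are selected from the static types.
        // Run time: the receiver's actual type decides between Father and Son.
        System.out.println(father.choose(new Coffee()));  // father chooses coffee
        System.out.println(son.choose(new Tea()));        // son chooses tea
    }
}
```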

For convenience of implementation, methods with the same signature should have the same index in the virtual method tables of the parent class and the subclass. That way, when the type changes, only the virtual method table being searched needs to change, and the required entry address can be found by the same index in a different table. The virtual method table is generally initialized during the linking stage of class loading: after the initial values of the class's variables have been prepared, the virtual machine also initializes the class's virtual method table.

Dynamic type language support

In the more than 20 years since Sun's first Java virtual machine appeared, only one instruction has been added to the Java virtual machine's bytecode instruction set: invokedynamic, the first new member of the instruction set, which arrived with the release of JDK 7. This instruction is one of the improvements made to achieve a JDK 7 project goal, support for dynamically typed languages (Dynamically Typed Language), and it also laid the technical groundwork for the smooth implementation of Lambda expressions in JDK 8.

What is a dynamically typed language? Its key feature is that the main part of type checking is performed at runtime rather than at compile time. Many languages have this feature, including APL, Clojure, Erlang, Groovy, JavaScript, Lisp, Lua, PHP, Prolog, Python, Ruby, Smalltalk, and Tcl. In contrast, languages that perform type checking at compile time, such as C and Java, are the most commonly used statically typed languages. In a dynamically typed language, variables have no type; only values have types.

Providing direct support for dynamic types at the Java virtual machine level became a problem that the Java platform had to solve in order to develop. This is the technical background behind the invokedynamic instruction and the java.lang.invoke package introduced by JSR-292 in JDK 7.

The java.lang.invoke package added in JDK 7 is an important part of JSR 292. Its main purpose is to provide, alongside the earlier approach of determining the target method purely through symbolic references, a new mechanism for determining the target method dynamically, called a "Method Handle".

·Reflection and MethodHandle mechanisms are essentially simulating method calls, but Reflection is simulating method calls at the Java code level, while MethodHandle is simulating method calls at the bytecode level.
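The two mechanisms can be compared side by side by calling the same method through each of them. The sketch below invokes String.toUpperCase() once via a MethodHandle and once via Reflection. The wrapper class and its method names are hypothetical, while the java.lang.invoke and java.lang.reflect calls are standard JDK APIs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical demo class: the same call made through both mechanisms.
final class HandleVsReflection {

    static String viaMethodHandle(String s) {
        try {
            // Return type String, no explicit parameters; the receiver is added by findVirtual.
            MethodType type = MethodType.methodType(String.class);
            MethodHandle mh = MethodHandles.lookup()
                    .findVirtual(String.class, "toUpperCase", type);
            return (String) mh.invokeExact(s);   // call-site type must be exactly (String)String
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static String viaReflection(String s) {
        try {
            Method m = String.class.getMethod("toUpperCase");
            return (String) m.invoke(s);         // arguments are passed as Objects
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaMethodHandle("abc"));  // ABC
        System.out.println(viaReflection("abc"));    // ABC
    }
}
```

The method handle is described by a MethodType, much as a descriptor describes a method in the Class file, whereas the reflective Method object carries the full Java-level view of the method.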

In the Tomcat directory structure, 3 groups of directories can be set up to store Java class libraries (/common/*, /server/*, and /shared/*; they are not necessarily enabled by default, and only the /lib/* directory may exist). Together with the web application's own "/WEB-INF/*" directory, that makes 4 groups. Each group has its own meaning:

·Place it in the /common directory. The class library can be used by Tomcat and all web applications.

·Place it in the /server directory. The class library can be used by Tomcat and is invisible to all web applications.

·Place it in the /shared directory. The class library can be used by all web applications, but is not visible to Tomcat itself.

·Place it in the /WebApp/WEB-INF directory. The class library can only be used by the web application and is not visible to Tomcat or other web applications.

To support this directory structure and to load and isolate the class libraries in these directories, Tomcat defines several custom class loaders, implemented according to the classic parent delegation model.


Common class loader, Catalina class loader (also known as Server class loader), Shared class loader and Webapp class loader are Tomcat’s own class loaders. They load the Java class libraries in /common/*, /server/*, /shared/* and /WebApp/WEB-INF/* respectively. There are usually multiple instances of WebApp class loaders and JSP class loaders. Each Web application corresponds to a WebApp class loader, and each JSP file corresponds to a JasperLoader class loader.

The loading scope of JasperLoader is only the Class file compiled by this JSP file. The purpose of its existence is to be discarded: when the server detects that the JSP file has been modified, it will replace the current JasperLoader instance, and create a new JSP class loader to implement the HotSwap function of the JSP file.
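The visibility rules listed earlier follow from how the loaders are parented. The sketch below is not Tomcat code. It assumes a hierarchy in which the Common loader is the parent of both the Catalina and Shared loaders and each WebApp loader is a child of Shared, and it models each loader with an empty URLClassLoader. All names are hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical model of a Tomcat-style loader tree; real Tomcat classes differ.
final class TomcatStyleLoaders {
    final ClassLoader common;
    final ClassLoader catalina;
    final ClassLoader shared;
    final ClassLoader webappA;
    final ClassLoader webappB;

    TomcatStyleLoaders() {
        ClassLoader app = ClassLoader.getSystemClassLoader();
        common = newLoader(app);         // would serve /common/*
        catalina = newLoader(common);    // would serve /server/*
        shared = newLoader(common);      // would serve /shared/*
        webappA = newLoader(shared);     // would serve one application's /WEB-INF/*
        webappB = newLoader(shared);     // a second, isolated application
    }

    private static ClassLoader newLoader(ClassLoader parent) {
        return new URLClassLoader(new URL[0], parent);   // empty search path for the sketch
    }

    /** Under parent delegation, a loader sees classes of itself and its ancestors only. */
    static boolean canSee(ClassLoader from, ClassLoader target) {
        for (ClassLoader cl = from; cl != null; cl = cl.getParent()) {
            if (cl == target) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TomcatStyleLoaders t = new TomcatStyleLoaders();
        System.out.println(canSee(t.webappA, t.common));    // true
        System.out.println(canSee(t.webappA, t.shared));    // true
        System.out.println(canSee(t.webappA, t.catalina));  // false
        System.out.println(canSee(t.webappA, t.webappB));   // false
    }
}
```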


Statement:
This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete