Simply put, extern "C" lets C++ code declare or define symbols the way C does, for compatibility with C. That is easy to say but takes some effort to understand, so we start with how C++ and C differ.
Symbols
As everyone knows, turning source code into an executable program takes two steps: compilation and linking. Besides checking syntax and expanding code (macros and the like), the compilation step also converts the names of variables and functions into symbols, and the linker then locates everything through those symbols. The compiler converts names into symbols differently for C code and for C++ code. The compiler used in this article is gcc 4.4.7.
Let's first look at a simple piece of code:
    /* hello.c */
    #include <stdio.h>

    const char* g_prefix = "hello ";

    void hello(const char* name)
    {
        printf("%s%s", g_prefix, name);
    }
Note that the file is named hello.c. Compiling it with gcc -c hello.c produces the object file hello.o. On Linux, nm lists the symbol table of the object file, with the following result ($ is the shell prompt):
    $ nm hello.o
    0000000000000000 D g_prefix
    0000000000000000 T hello
                     U printf
This is the symbol list of the compiled C code; the third column is the symbol name. The entries to look at are the ones we defined ourselves, the global variable g_prefix and the function hello: their symbol names are identical to the names in the code. Now rename hello.c to hello.cpp, recompile with gcc -c hello.cpp to get a new hello.o, and inspect it with nm again. The result is:
    0000000000000000 T _Z5helloPKc
                     U __gxx_personality_v0
    0000000000000000 D g_prefix
                     U printf
This is the symbol list of the compiled C++ code; gcc decides whether a file is C or C++ from its file extension. The symbol for g_prefix has not changed, but the symbol for the function hello has become _Z5helloPKc, which shows that gcc handles C and C++ code differently. For C code, the symbol name is the name itself. (Early compilers put an underscore _ in front of C names. gcc on Linux no longer does so by default, and the behavior can be set explicitly with the options -fleading-underscore and -fno-leading-underscore.) For C++ code, a data variable that is not nested keeps its own name as the symbol. If the name is nested (inside a namespace or class) or is a function name, the symbol is built by the following rules:
1. The symbol starts with _Z.
2. If the name is nested, N comes next, followed by the names of the enclosing namespaces and classes and then the name itself, each prefixed with its length. The sequence ends with E.
3. If the name is not nested, the name follows directly, prefixed with its length.
4. For a function, the parameter list comes last. Types map to codes as follows:
    int    -> i
    float  -> f
    double -> d
    char   -> c
    void   -> v
    const  -> K
    *      -> P
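As a rough illustration, the four rules above can be written as a toy encoder. This is only a sketch of the simplified rules in this article, not the full scheme that gcc implements. The names component and mangle_sketch are made up for this example, and parameter types are passed in already encoded (for example PKc).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Length-prefixed name component, e.g. "hello" -> "5hello".
std::string component(const std::string& name) {
    assert(!name.empty());
    return std::to_string(name.size()) + name;
}

// Toy encoder for the simplified rules 1-4 (hypothetical helper).
// scopes:      enclosing namespaces/classes, outermost first.
// is_function: functions are always mangled; un-nested data is not.
// param_codes: already-encoded parameter list, e.g. "PKc" or "v".
std::string mangle_sketch(const std::vector<std::string>& scopes,
                          const std::string& name,
                          bool is_function,
                          const std::string& param_codes = "") {
    // Un-nested data variable: the symbol is the name itself.
    if (scopes.empty() && !is_function) return name;

    std::string symbol = "_Z";                              // rule 1
    if (!scopes.empty()) {                                  // rule 2
        symbol += "N";
        for (const std::string& scope : scopes) symbol += component(scope);
        symbol += component(name);
        symbol += "E";
    } else {                                                // rule 3
        symbol += component(name);
    }
    if (is_function) symbol += param_codes;                 // rule 4
    return symbol;
}
```

Under these assumptions, mangle_sketch({}, "hello", true, "PKc") yields _Z5helloPKc, and a variable var nested in a namespace myname, written as mangle_sketch({"myname"}, "var", false), yields _ZN6myname3varE.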
With these rules it is easy to see why void hello(const char*) in C++ code compiles to the symbol _Z5helloPKc. (Read from right to left, PKc gives the type char const *, which is the compiler's internal representation. It is the same type as the const char* we usually write.) The c++filt tool recovers the name from a symbol; it is used as c++filt _Z5helloPKc.
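The same lookup can also be done from inside a program. The sketch below assumes a GCC or Clang toolchain, whose C++ runtime provides abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle is made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the readable form of a mangled symbol, or the input unchanged
// when the runtime reports that it could not demangle it.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must release it with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return mangled;
    std::string result(text);
    std::free(text);
    return result;
}
```

For the symbol from hello.cpp, demangle("_Z5helloPKc") returns hello(char const*), the same text c++filt prints.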
It is also easy to see why C++ supports function overloading and C does not: C++ encodes the parameter types into the symbol when it mangles a function name, and C does not. In C++, functions with the same name therefore get different symbols as long as their parameters differ, so the symbols never conflict. We can verify this relationship between names and symbols with the following example:
    /* filename: test.cpp */
    #include <stdio.h>

    namespace myname {
        int var = 42;
    }

    /* Declared by its mangled name, so it refers to the same symbol as myname::var. */
    extern int _ZN6myname3varE;

    int main()
    {
        printf("%d\n", _ZN6myname3varE);
        return 0;
    }
Here we define the global variable var inside the namespace myname. By the rules above it is mangled to the symbol _ZN6myname3varE. We then manually declare an external variable named _ZN6myname3varE and print it. Compiling and running the program shows that its value is exactly the value of var:
    $ gcc test.cpp -o test -lstdc++
    $ ./test
    42
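The earlier point about overloading can be sketched in the same spirit. The function name twice is made up for this example, and the symbol names in the comments are what the rules above predict for g++; running nm on the object file is the way to confirm them on a given toolchain.

```cpp
#include <cassert>

// Two overloads with the same name but different parameter types.
// By the rules above, g++ is expected to emit two distinct symbols.
int twice(int x) { return 2 * x; }           // expected symbol: _Z5twicei
double twice(double x) { return 2.0 * x; }   // expected symbol: _Z5twiced

// If both were declared extern "C", they would both need the single
// symbol "twice", and the compiler would reject the second one.

// Small usage check: overload resolution picks by argument type.
void check_overloads() {
    assert(twice(21) == 42);
    assert(twice(1.5) == 3.0);
}
```

Because the parameter codes (i and d) are part of the symbols, the linker sees two unrelated names and the overloads never collide.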
extern "C"
With the concept of symbols in hand, the usage of extern "C" is easy to follow:
    extern "C" {
        int func(int);
        int var;
    }
This tells the compiler to treat the code inside the braces after extern "C" as C code. It can also be applied to a single declaration:
    extern "C" int func(int);
    extern "C" int var;
This gives func and var C-style symbols. (One difference between the two forms: inside the braces, int var; is a definition, while the single-statement form extern "C" int var; is only a declaration.) Often we write a header file that declares some C functions, and those functions may be called from both C and C++ code. When the header is included from C++ code, the declarations must be wrapped in extern "C"; otherwise the C++ compiler refers to the functions by their mangled names and the symbols cannot be found at link time. When the header is included from C code, however, extern "C" must not appear, because C does not support this syntax. The common solution is shown below, using the C library function memset as an example:
    #include <stddef.h>  /* for size_t */

    #ifdef __cplusplus
    extern "C" {
    #endif

    void *memset(void*, int, size_t);

    #ifdef __cplusplus
    }
    #endif
Here __cplusplus is a macro that C++ compilers define. If this code is compiled as C++, memset is declared inside extern "C". If it is compiled as C, __cplusplus is not defined, so memset is declared directly and no syntax error occurs. System header files use this technique frequently.
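Headers with many declarations often hide the guard behind a pair of macros, similar in spirit to the __BEGIN_DECLS and __END_DECLS macros found in glibc headers. The sketch below is one possible arrangement. The names MYLIB_BEGIN_DECLS, MYLIB_END_DECLS, mylib_add and check_mylib are hypothetical, and the header and its implementation are shown in a single file only so that the example stands on its own.

```cpp
#include <cassert>

/* --- would normally live in mylib.h (hypothetical header) --- */
#ifdef __cplusplus
#  define MYLIB_BEGIN_DECLS extern "C" {
#  define MYLIB_END_DECLS   }
#else
#  define MYLIB_BEGIN_DECLS   /* expands to nothing when compiled as C */
#  define MYLIB_END_DECLS
#endif

MYLIB_BEGIN_DECLS
int mylib_add(int a, int b);
MYLIB_END_DECLS

/* --- stand-in for mylib.c, which would normally be compiled as C --- */
MYLIB_BEGIN_DECLS
int mylib_add(int a, int b) { return a + b; }
MYLIB_END_DECLS

/* --- caller, compiled as C++ --- */
void check_mylib() {
    // Refers to the plain symbol mylib_add, not a mangled C++ name.
    assert(mylib_add(2, 3) == 5);
}
```

The design choice is only about readability: each header writes two short macro lines instead of repeating the #ifdef __cplusplus block, and the effect on the symbols is the same.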