Foreword
Recently a C++ project of mine needed scripting, which raised the question of how to bind objects to the script runtime. Because several scripting languages are involved, existing binding techniques could not meet the need, so the only option was to parse the C++ header files and generate bindings from that description. At first I found that boost has a solution, but boost is too bloated and places requirements on the development environment of any project that adopts it. I want project management to be as simple as possible, so boost is not a good fit, and I wrote my own.
Text
Summary of the syntax format of C++
Except for preprocessor directives (lines starting with #) and function definitions, every declaration must end with a semicolon
Code blocks must be enclosed in braces "{}", except for a single statement following if, do/while, or for
There are 8 kinds of code block: namespace, global, class, struct, global function, member function, lambda, and an unnamed block inside a function
A namespace behaves like the global scope, except that declared names gain a "namespace::" prefix
A class declaration block differs from the global block in having friend declarations and member access specifiers
Templates apply to classes and functions
Functions and templates have parameter lists
A declaration ending with a semicolon can have a built-in type (int, double, etc.), a class, a template class, a typedef type, a function pointer, or a lambda
Functions can share the same name (overloading)
A member function body may use class members that are declared after it, whereas a global function may only use names that have already been declared. It is therefore impossible to check whether a C++ file is legal with a single forward-only pass. The enumerators of a scoped enumeration (enum class) are not visible in the scope where the enumeration is declared; for an unscoped enumeration, the enumerators must also be added to the enclosing scope.
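The last rule is easier to see in code. The following is a minimal sketch of the declaration-order and enumeration-scope rules described above; every name in it (counter, early, later, color, shape) is made up for illustration and has nothing to do with the parser itself.

```cpp
#include <cassert>

// Inside a class, a member function body may use members declared later.
struct counter {
    int next() { return step() + value_; }  // step() and value_ appear below
    int step() const { return 1; }
    int value_ = 41;
};

// At namespace scope a name must be declared before it is used,
// so early() only compiles because of this forward declaration.
int later();
int early() { return later(); }
int later() { return 7; }

// Unscoped enum: the enumerators are also added to the enclosing scope.
enum color { red, green };
// Scoped enum: the enumerators are reachable only through the enum's name.
enum class shape { circle, square };

int enum_demo() {
    color c = green;          // the plain name is visible here
    shape s = shape::square;  // qualification is required
    return static_cast<int>(c) + static_cast<int>(s);
}

// Small self-check that exercises the three rules.
void check_rules() {
    assert(counter{}.next() == 42);
    assert(early() == 7);
    assert(enum_demo() == 2);
}
```

A parser that reads strictly forward can record what counter declares, but it cannot resolve step() inside next() at the moment it first sees it.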
Not planned for support (not needed for my requirements)
Namespace
Template
Function body
Multiple variables separated by commas in one declaration
Also not supported
Type verification
Default parameters
union
Name-conflict checks for enumerations
lambda
Checking that variable names do not start with a digit
Inheritance of classes
Function pointers
Principles of development
Not cross-platform yet (Visual Studio only), but no system APIs are used, so porting should be straightforward
Written in C++
The iterator only advances; it never moves backward
A syntax error or the end of the file raises an exception
On encountering {, the parser enters block processing
For a language unit that ends with ;, the function handling it must itself consume the ; and everything before it, then return
The code inside member function definitions is not analyzed, because the forward-only principle cannot hold there: a member function may refer to other members that have not been declared yet
Performance is not the best it could be, but this structure can be optimized to the extreme
Complete analysis is not the goal, but this framework can be extended to analyze all features of C++11 and later
There are no comments: my English is not good, and Chinese comments are not suitable for an international audience
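To make the scanning principles concrete, here is a rough sketch of a forward-only scanner that follows them. It is not the code from cpp_analysis: parse_block, scan, and check_scan are hypothetical names, and the sketch only recognizes braces and semicolons, collecting the text of each unit that ends with ;.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parses units until the closing '}' of the current
// block, or until end of input when top_level is true.
template <class It>
void parse_block(It& it, It end, bool top_level, std::vector<std::string>& out) {
    std::string text;
    while (it != end) {
        char c = *it;
        ++it;  // the iterator only advances, it never steps back
        if (c == ';') {
            // a unit ends here: the ';' and everything before it are consumed
            if (!text.empty()) out.push_back(text);
            text.clear();
        } else if (c == '{') {
            parse_block(it, end, false, out);  // enter block processing
            text.clear();                      // the block header is dropped
        } else if (c == '}') {
            if (top_level) throw std::logic_error("unmatched '}'");
            if (!text.empty()) throw std::logic_error("missing ';' before '}'");
            return;
        } else if (!(text.empty() && std::isspace(static_cast<unsigned char>(c)))) {
            text += c;  // leading whitespace of a unit is skipped
        }
    }
    if (!top_level) throw std::logic_error("end of file inside a block");
    if (!text.empty()) throw std::logic_error("end of file before ';'");
}

// Hypothetical entry point: returns the text of every ';'-terminated unit.
std::vector<std::string> scan(const std::string& src) {
    std::vector<std::string> out;
    auto it = src.begin();
    parse_block(it, src.end(), true, out);
    return out;
}

// Small self-check: one good input and one truncated input.
void check_scan() {
    std::vector<std::string> r = scan("int a;{int b;}");
    assert(r.size() == 2 && r[0] == "int a" && r[1] == "int b");
    bool threw = false;
    try { scan("{int b;"); } catch (const std::logic_error&) { threw = true; }
    assert(threw);
}
```

Because parse_block only uses *it, ++it, and !=, it should also work with an input iterator such as std::istreambuf_iterator<char>, which is what makes the "never retreat" rule useful.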
Structure description
variant, method, type, comment, enumeration, and enumeration_value all inherit from object; each of them can belong to the global scope, a class, or a struct.
document represents a C++ compilation unit.
context represents a scope, a queue that can be searched upward. After the document has been parsed, it holds the parsed variables, functions, and types.
reader is the file reader, a forward char iterator; it can be replaced by istreambuf_iterator.
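The upward search in context could look roughly like the sketch below. This is an assumption about the shape of the design, not the project's actual code: the class layouts, the parent pointer, and the add and find members are all hypothetical, and only the names object, type, variant, and context come from the description above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the common base class.
struct object {
    std::string name;
    virtual ~object() = default;
};
struct type : object {};
struct variant : object {
    const type* of = nullptr;  // the declared type of the variable
};

// Hypothetical context: a scope that falls back to its parent on lookup.
class context {
public:
    explicit context(const context* parent = nullptr) : parent_(parent) {}

    // Registers an object in this scope; the caller keeps ownership.
    void add(const object& o) { symbols_[o.name] = &o; }

    // Searches this scope first, then each enclosing scope in turn.
    const object* find(const std::string& name) const {
        for (const context* c = this; c != nullptr; c = c->parent_) {
            auto it = c->symbols_.find(name);
            if (it != c->symbols_.end()) return it->second;
        }
        return nullptr;
    }

private:
    const context* parent_;
    std::map<std::string, const object*> symbols_;
};

// Small self-check: a class scope nested in the global scope.
void check_lookup() {
    type t;
    t.name = "size_t";
    variant v;
    v.name = "count";
    v.of = &t;
    context global;
    global.add(t);
    context cls(&global);
    cls.add(v);
    assert(cls.find("count") == &v);          // found in the class scope
    assert(cls.find("size_t") == &t);         // found by searching upward
    assert(global.find("count") == nullptr);  // lookup never goes downward
}
```

A parent pointer is used here for brevity; a queue or stack of scopes, as the description suggests, gives the same lookup order.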
Usage
The main function is guarded by the _DEBUG macro, so it is recommended to build the release configuration as a library.
#include
try
{
    auto result = cpp_analysis::analysis("[cpp_header_file]");
    // todo
}
catch (const logic_error& e)
{
    // thrown on a syntax error or at the end of the file
    // todo:
}
Project location
https://github.com/FettLuo/cpp_analysis