A Brief Discussion of PHP Automated Code Audit Technology

Original source: exploit

0×00

Since I have had nothing else to post on the blog lately, I will summarize what I am currently working on and treat it as a blog post, mainly covering some of the techniques used in the project. There are many PHP automated audit tools available today, including open-source ones such as RIPS and Pixy and commercial ones such as Fortify. RIPS currently exists only in its first version; because it does not support analysis of object-oriented PHP code, its results are not very satisfactory today. Pixy is based on data flow analysis but supports only PHP 4. Fortify is a commercial product, which makes it impractical to study. In China, research on automated PHP auditing is mostly done inside companies. Most current tools rely on simple token stream analysis or, more crudely, on regular expression matching, and their results are mediocre.

0×01

The technique I want to discuss today is an approach to automated PHP auditing based on static analysis, which is also the approach taken in my project. Regular expressions are clearly not up to the task of effective variable analysis and taint analysis, or of coping with the many flexible syntactic forms found in PHP scripts. The approach introduced here audits code using static code analysis and data flow analysis.

First of all, I think an effective audit tool needs at least the following modules:

1. Compiler front-end module
The compiler front-end module uses abstract syntax tree (AST) construction and control flow graph (CFG) construction from compiler technology to convert source files into a form suitable for static analysis by the back end.

2. Global information collection module
This module collects summary information about the source files under analysis, such as how many classes the audited project defines, together with each method's name, parameters, and the start and end line numbers of its definition block, in order to speed up later static analysis.

3. Data flow analysis module
This module differs from the data flow analysis algorithms of compiler technology in that it focuses on the characteristics of the PHP language itself. When the system finds a call to a sensitive function during interprocedural or intraprocedural analysis, it performs data flow analysis on the function's sensitive parameters, that is, it tracks how each variable changes, in preparation for the subsequent taint analysis.

4. Vulnerable code analysis module
This module performs taint analysis on the global variables, assignment statements, and other information collected by the data flow analysis module. It targets the dangerous parameters of sensitive sinks, such as the first parameter of mysql_query, and obtains the corresponding data flow information by backtracking. If backtracking shows that the parameter may be user-controlled, this is recorded. If the parameter has been encoded or sanitized, that operation is recorded as well. Taint analysis is completed by tracking and analyzing the data that reaches the dangerous parameters.
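
The idea of a "dangerous parameter in a sensitive sink" can be captured as a small lookup table. The following is only a sketch, written in Python for brevity (the project itself is written in PHP); the `SinkSpec` structure, the vulnerability labels, and the helper `dangerous_arguments` are illustrative assumptions, not the project's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SinkSpec:
    """Describes one sensitive sink (hypothetical structure)."""
    name: str              # PHP function name
    dangerous_args: tuple  # 0-based indices of the arguments worth backtracking
    vuln_type: str         # label used when reporting a finding


# Illustrative entries only; a real tool would load a longer list
# from its configuration file.
SINKS = {
    spec.name: spec
    for spec in (
        SinkSpec("mysql_query", (0,), "SQLI"),
        SinkSpec("exec", (0,), "CMD_EXEC"),
        SinkSpec("system", (0,), "CMD_EXEC"),
    )
}


def dangerous_arguments(func_name, args):
    """Return the call arguments that must be traced, or [] if not a sink."""
    # PHP function names are case-insensitive, so normalize before lookup.
    spec = SINKS.get(func_name.lower())
    if spec is None:
        return []
    return [args[i] for i in spec.dangerous_args if i < len(args)]


traced = dangerous_arguments("mysql_query", ["$sql", "$link"])
```

Keeping the argument indices in the table lets the backtracking step start from exactly the parameters that matter instead of every argument of the call.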

0×02

With these modules in place, how do they fit together into a working automated audit? The overall flow of the analysis system I used is as follows:

1. Framework initialization

First, initialize the analysis framework. This mainly means collecting information about every user-defined class in the source project to be analyzed, including class names, class properties, class method names, and the path of the file in which each class is defined.
These records are stored in the global context class Context, which is implemented as a singleton and stays resident in memory so that later analysis steps can use it.
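
A singleton context of this kind might look roughly like the sketch below. It is written in Python for brevity rather than in the project's PHP, and everything except the name Context (the `ClassInfo` record, the method names, the sample class) is assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClassInfo:
    """What the initialization pass records about one user-defined class."""
    name: str
    file_path: str
    properties: list = field(default_factory=list)
    methods: list = field(default_factory=list)


class Context:
    """Global store shared by all later analysis passes (singleton)."""
    _instance = None

    def __new__(cls):
        # Create the single instance on first use, then always return it.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.classes = {}
        return cls._instance

    def register_class(self, info):
        # PHP class names are case-insensitive, so normalize the key.
        self.classes[info.name.lower()] = info

    def find_class(self, name):
        return self.classes.get(name.lower())


# Hypothetical class discovered during initialization.
Context().register_class(
    ClassInfo("UserModel", "app/models/UserModel.php", ["$db"], ["findById"])
)
```

Because every call to `Context()` returns the same object, a later pass can look up `Context().find_class("UserModel")` without the store being passed around explicitly.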

2. Determine Main File

Next, determine whether each PHP file is a Main File. PHP has no main function as such. Most PHP files in a web application fall into two types: calling and defining. Definition-type files define business classes, utility classes, utility functions, and so on; they are not accessed directly by users but are included by calling-type files. The files that actually handle user requests are the calling-type files, such as the top-level index.php. Static analysis mainly targets these calling-type files that handle user requests, that is, the Main Files. The criterion is:
Once AST analysis is complete, check whether the lines occupied by class and method definitions exceed a certain proportion of all lines in the file. If so, the file is treated as a definition-type file; otherwise it is a Main File and is added to the list of files to analyze.
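
As a minimal sketch of this criterion (in Python for brevity), assume the AST pass has already produced the start and end lines of every class and function definition. The function name `is_main_file` and the 0.6 threshold are assumptions; the article does not give a concrete value.

```python
def is_main_file(total_lines, definition_spans, threshold=0.6):
    """Classify a file as a Main File (True) or a definition file (False).

    definition_spans holds inclusive (start_line, end_line) pairs for every
    class or function definition found in the file's AST.
    The default threshold is an assumed value, not taken from the article.
    """
    if total_lines <= 0:
        return False
    covered = set()
    for start, end in definition_spans:
        # A set avoids double-counting methods nested inside a class span.
        covered.update(range(start, end + 1))
    return len(covered) / total_lines <= threshold


# A 100-line file that is 90% class body is a definition file;
# a 50-line script with one small helper function is a Main File.
library_file = is_main_file(100, [(1, 90)])
entry_script = is_main_file(50, [(40, 45)])
```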

3. AST construction

This project is itself written in PHP. For AST construction it relies on PHP Parser, currently an excellent implementation of PHP AST construction.
That open-source project is also written in PHP and can parse most PHP constructs, such as if, while, switch, array declarations, method calls, and global variables. It handles part of this project's compiler front-end work very well.

4. CFG flow graph construction

CFG construction is handled by the CFGBuilder method of the CFGGenerator class.

The idea is to build the CFG recursively. The input is the collection of nodes obtained by traversing the AST. During the traversal, the type of each node in the collection is examined, for example whether it is a branch, jump, or terminating statement, and the CFG is built according to the node type.
The jump conditions of branch and loop statements should be stored on the edges (Edge) of the CFG to make data flow analysis easier.
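
The recursive construction can be illustrated on a toy statement list. This sketch is in Python and is not the project's CFGBuilder: the tuple-based statement format, the `CFG` class, and the helper names are all assumptions, and each statement becomes its own node rather than being merged into basic blocks.

```python
class CFG:
    def __init__(self):
        self.nodes = []   # node id -> label
        self.edges = []   # (src, dst, condition or None)

    def add_node(self, label):
        self.nodes.append(label)
        return len(self.nodes) - 1

    def connect(self, preds, node):
        # preds is a list of (node id, condition) pairs flowing into node.
        for src, cond in preds:
            self.edges.append((src, node, cond))


def build(cfg, stmts, preds, exit_id):
    """Add stmts to cfg recursively; return the dangling predecessors."""
    for stmt in stmts:
        kind = stmt[0]
        if kind == "if":
            _, cond, then_body, else_body = stmt
            node = cfg.add_node(f"if {cond}")
            cfg.connect(preds, node)
            # The branch condition is stored on the outgoing edges.
            then_out = build(cfg, then_body, [(node, cond)], exit_id)
            else_out = build(cfg, else_body, [(node, f"!({cond})")], exit_id)
            preds = then_out + else_out
        elif kind == "while":
            _, cond, body = stmt
            node = cfg.add_node(f"while {cond}")
            cfg.connect(preds, node)
            cfg.connect(build(cfg, body, [(node, cond)], exit_id), node)  # back edge
            preds = [(node, f"!({cond})")]
        elif kind == "return":
            node = cfg.add_node(stmt[1])
            cfg.connect(preds, node)
            cfg.connect([(node, None)], exit_id)
            return []          # statements after a return are unreachable
        else:
            node = cfg.add_node(stmt[1])
            cfg.connect(preds, node)
            preds = [(node, None)]
    return preds


def build_cfg(stmts):
    cfg = CFG()
    entry, exit_id = cfg.add_node("ENTRY"), cfg.add_node("EXIT")
    cfg.connect(build(cfg, stmts, [(entry, None)], exit_id), exit_id)
    return cfg


program = [
    ("stmt", "$id = $_GET['id']"),
    ("if", "is_numeric($id)",
     [("stmt", "$sql = 'SELECT * FROM t WHERE id=' . $id")],
     [("return", "return")]),
    ("stmt", "mysql_query($sql)"),
]
cfg = build_cfg(program)
```

With the condition kept on each edge, a later pass can tell that the path reaching `mysql_query($sql)` only exists when `is_numeric($id)` holds.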

5. Collection of data flow information

For a block of code, the information most worth collecting is assignment statements, function calls, constants (const, define), and registered variables (extract, parse_str).
Assignment statements are collected for later variable tracking. In my implementation, a structure records the assigned value and its location. The other information is identified and extracted from the AST. For a function call, for example, the analysis determines whether the variable is escaped or encoded, or whether the called function is a sink (such as mysql_query).
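
A per-block summary of this information could be organized along the following lines. The sketch is in Python, and the `Assignment` and `BlockSummary` structures, the dictionary-based statement format, and `summarize` are assumptions made for illustration; the article only says that a structure holds the assigned value and its location.

```python
from dataclasses import dataclass, field


@dataclass
class Assignment:
    target: str    # variable being assigned, e.g. "$sql"
    sources: list  # variables read on the right-hand side
    line: int      # location, used to find the latest definition later


@dataclass
class BlockSummary:
    assignments: list = field(default_factory=list)
    calls: list = field(default_factory=list)      # (name, args, line)
    constants: dict = field(default_factory=dict)  # from define()
    registers_vars: bool = False                   # extract()/parse_str() seen


# Calls that can create variables the analysis cannot see by name.
REGISTERING_FUNCS = {"extract", "parse_str"}


def summarize(statements):
    """Collect data flow facts from a list of simplified statements."""
    summary = BlockSummary()
    for st in statements:
        if st["kind"] == "assign":
            summary.assignments.append(
                Assignment(st["target"], st["sources"], st["line"]))
        elif st["kind"] == "call":
            name = st["name"].lower()
            if name == "define" and len(st["args"]) >= 2:
                summary.constants[st["args"][0]] = st["args"][1]
            elif name in REGISTERING_FUNCS:
                summary.registers_vars = True
            summary.calls.append((name, st["args"], st["line"]))
    return summary


block = [
    {"kind": "assign", "target": "$id", "sources": ["$_GET"], "line": 3},
    {"kind": "call", "name": "extract", "args": ["$_POST"], "line": 4},
    {"kind": "call", "name": "define", "args": ["TABLE", "users"], "line": 5},
    {"kind": "call", "name": "mysql_query", "args": ["$sql"], "line": 9},
]
summary = summarize(block)
```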

6. Handling variable sanitization and encoding information

$clearsql = addslashes($sql);
When the right-hand side of an assignment is a filter function (user-defined or built-in), the return value of the call is considered sanitized; that is, addslashes is added to the sanitization tags of $clearsql.
When a function call is found, check whether the function name is one of the safe functions listed in the configuration file.
If it is, add the sanitization tag to the symbol at that location.
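
The tagging step might be sketched as follows, in Python for brevity. The `SANITIZERS` table stands in for the configuration file, and both its contents and the `Symbol` structure are assumptions; in particular, treating addslashes as sufficient against SQL injection follows the article's example and is not a general guarantee.

```python
from dataclasses import dataclass, field

# Assumed configuration: sanitizer name -> vulnerability classes it covers.
SANITIZERS = {
    "addslashes": {"SQLI"},
    "mysql_real_escape_string": {"SQLI"},
    "htmlspecialchars": {"XSS"},
    "intval": {"SQLI", "XSS"},
}


@dataclass
class Symbol:
    name: str
    sanitized_by: list = field(default_factory=list)  # tags, in order applied
    safe_for: set = field(default_factory=set)


def assign_from_call(target, func_name, symbols):
    """Record `target = func_name(...)` and tag target if func_name sanitizes.

    Tags already carried by the call's argument are not propagated in this
    sketch; a fuller version would merge them in.
    """
    sym = Symbol(target)  # a new assignment replaces the variable's old tags
    key = func_name.lower()
    if key in SANITIZERS:
        sym.sanitized_by.append(key)
        sym.safe_for |= SANITIZERS[key]
    symbols[target] = sym
    return sym


symbols = {}
clearsql = assign_from_call("$clearsql", "addslashes", symbols)
```

Recording which vulnerability classes a tag covers, rather than a single "clean" flag, lets the taint step notice that a value escaped for SQL is still unsafe to echo into HTML.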

7. Interprocedural analysis

If a call to a user-defined function is found during the audit, interprocedural analysis is required: the code block of the called method must be located in the project under analysis, and the call's variables must be carried into it for analysis.
The difficulties are how to backtrack variables, how to handle methods with the same name in different files, how to support analysis of class method calls, how to record user-defined sinks (for example, if myexec calls exec without effective sanitization, then myexec should itself be treated as a dangerous function), and how to classify user-defined sinks (SQLI, XSS, XPATH, and so on).

The processing flow is as follows:
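
One piece of this step, deciding when a user function such as myexec should itself be treated as a sink, might be sketched as below. This is an illustrative Python sketch rather than the author's flow: the function summary format and `find_user_sinks` are assumptions, and sanitizers, intermediate assignments, and same-named functions in different files are deliberately ignored.

```python
# Built-in sinks: name -> (vulnerability class, dangerous argument indices).
BUILTIN_SINKS = {"exec": ("CMD_EXEC", [0]), "mysql_query": ("SQLI", [0])}


def find_user_sinks(functions, sinks=BUILTIN_SINKS):
    """Promote user functions that pass a parameter straight into a sink.

    functions maps a name to {"params": [...], "calls": [(callee, [args])]}.
    The result maps each promoted function to the vulnerability class it
    inherits and the indices of its own dangerous parameters.
    """
    known = dict(sinks)
    changed = True
    while changed:  # repeat so that wrappers of wrappers are found too
        changed = False
        for name, fn in functions.items():
            if name in known:
                continue
            for callee, args in fn["calls"]:
                if callee not in known:
                    continue
                vuln, dangerous = known[callee]
                hit = [fn["params"].index(args[i]) for i in dangerous
                       if i < len(args) and args[i] in fn["params"]]
                if hit:
                    known[name] = (vuln, hit)
                    changed = True
                    break
    return {n: v for n, v in known.items() if n not in sinks}


functions = {
    "myexec": {"params": ["$cmd"], "calls": [("exec", ["$cmd"])]},
    "run_job": {"params": ["$name", "$job"], "calls": [("myexec", ["$job"])]},
    "log_msg": {"params": ["$m"], "calls": [("error_log", ["$m"])]},
}
user_sinks = find_user_sinks(functions)
```

Carrying the vulnerability class along with the promoted function also answers the classification question: myexec inherits the class of the built-in sink it wraps.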

8. Taint analysis

After the steps above, the final task is taint analysis. It focuses mainly on built-in risky functions, such as echo, which may lead to XSS, and requires effective analysis of the dangerous parameters of these dangerous functions. This includes determining whether effective sanitization has been applied (such as escaping or regular expression matching) and designing algorithms to backtrack through the variable's earlier assignments and other transformations. This is undoubtedly a test of a security researcher's engineering ability, and it is the most important stage of automated auditing.
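
A heavily simplified version of such backtracking is sketched below in Python. It assumes straight-line code only (no branches, loops, or function calls), and the tuple format for assignments, the `USER_INPUT` and `SANITIZERS` tables, and the `backtrack` function are all assumptions made for illustration.

```python
USER_INPUT = {"$_GET", "$_POST", "$_COOKIE", "$_REQUEST"}

# Assumed sanitizer table: function name -> vulnerability classes it covers.
SANITIZERS = {
    "addslashes": {"SQLI"},
    "intval": {"SQLI", "XSS"},
    "htmlspecialchars": {"XSS"},
}


def backtrack(var, line, assignments, vuln):
    """Return True if var, as used at line, may carry unsanitized user input.

    assignments is a list of (line, target, sources, wrapping_function)
    tuples, where wrapping_function is None for a plain assignment.
    """
    if var in USER_INPUT:
        return True
    # Find the latest definition of var strictly before the point of use.
    defs = [a for a in assignments if a[1] == var and a[0] < line]
    if not defs:
        return False  # no visible definition, so nothing to trace
    def_line, _, sources, func = max(defs, key=lambda a: a[0])
    if func is not None and vuln in SANITIZERS.get(func, set()):
        return False  # sanitized for this vulnerability class
    # Each recursive step looks only at earlier lines, so this terminates.
    return any(backtrack(s, def_line, assignments, vuln) for s in sources)


assignments = [
    (1, "$id", ["$_GET"], None),
    (2, "$name", ["$_POST"], "addslashes"),
    (3, "$sql", ["$id", "$name"], None),
    (4, "$html", ["$name"], None),
]
sql_tainted = backtrack("$sql", 5, assignments, "SQLI")
html_tainted = backtrack("$html", 5, assignments, "XSS")
```

Here `$sql` is reported because `$id` reaches it unescaped, and `$html` is reported for XSS even though `$name` went through addslashes, since that function is not listed as covering XSS.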

0×03

As the introduction above shows, there are many pitfalls in building your own automated audit tool. I ran into many difficulties in my own attempts, and static analysis does have real limitations. For example, the result of a string transformation, which dynamic analysis obtains easily, is hard to obtain in static analysis. This is not a matter of awaiting a technical breakthrough; it stems from the limitations of static analysis itself. So if a static analyzer is to achieve low false positive and false negative rates, it ultimately has to borrow some dynamic ideas, such as simulating the execution of code in eval and modeling string transformation functions and regular expressions. In addition, in some MVC-based frameworks, such as CodeIgniter (CI), the code is very scattered; for example, the data sanitization code lives in an extension of the input class. For PHP applications like this, I think a universal audit framework is hard to achieve, and they should be handled case by case.

The above is only a rough summary of my current attempts (not yet fully implemented), offered for sharing. I am only a university student, not a professional, but I hope it encourages more security researchers to pay attention to this field.
