
A brief introduction to distributed tracing in PHP

巴扎黑 (Original)
2017-08-17 09:03:33



Since we moved to microservices, we have run into many problems, and the biggest one is troubleshooting. A service interface usually depends on several other services, and when a dependency is slow, the quality of the interface that calls it suffers directly.

Slowness caused by dependencies is very common in production, but it is hard to troubleshoot. Developers usually trace problems through large volumes of logs, which is not intuitive, and at some companies developers cannot see what actually happens in production at all. These low-probability failures generally point to hidden weaknesses in the system. As traffic grows, the weaknesses are amplified and can lead directly to large-scale outages. Avoiding this takes a lot of work, and the most direct step is to use a distributed tracing system for statistical analysis.

We often see experts talk about how to optimize and improve online performance, but they tend to skip an important step: how do they discover low-probability faults in the first place? Distributed tracing systems are common at large Internet companies, but small and medium-sized companies usually lack the technical resources to build one. In our view, even a low-traffic system can be critical to the company and needs this kind of hardening. We can only solve problems that we are able to find, and that is the goal I have been working toward.

Implementing a distributed tracing system is technically demanding. It has to handle performance capture, log writing, log collection, log transmission, log storage, log indexing, log analysis, and a final merged display, and it must withstand the load of high-traffic systems. For example, if each interface call produces 1 KB of log per request, a server handling 2000 QPS generates about 2 MB of logs per second. If each request depends on 5 interfaces, that becomes about 10 MB per second. The more complex the business and the higher the traffic, the larger this number gets.
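The estimate above is simple multiplication, and it can be useful to rerun it with your own numbers. The following is a minimal sketch in Python; the function name and the use of decimal units (1 KB = 1000 bytes) are assumptions made for illustration, and the daily figure assumes the peak rate is sustained all day, which real traffic rarely does.

```python
KB = 1000          # decimal units, matching the round numbers in the text
MB = 1000 * KB
GB = 1000 * MB


def log_bytes_per_second(qps: int, bytes_per_call: int, calls_per_request: int = 1) -> int:
    """Log volume produced per second for the given request rate."""
    return qps * bytes_per_call * calls_per_request


# 2000 QPS, 1 KB of log per interface call
single = log_bytes_per_second(qps=2000, bytes_per_call=1 * KB)

# The same traffic when each request depends on 5 interfaces
fan_out = log_bytes_per_second(qps=2000, bytes_per_call=1 * KB, calls_per_request=5)

print(single / MB, "MB/s")               # 2.0 MB/s
print(fan_out / MB, "MB/s")              # 10.0 MB/s
print(fan_out * 86_400 / GB, "GB/day")   # 864.0 GB/day if the rate were sustained
```

Even as a rough upper bound, the daily figure shows why storage and indexing, not just collection, drive the cost of a tracing system.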

Large Internet companies have many distributed tracing systems that can handle billions of requests, but for a small company this architecture is too expensive. Such systems depend on components like distributed message queues, distributed storage, and distributed computing, which alone need at least 6 servers. That is not cost-effective for an ordinary small company.

This time we are open-sourcing two kinds of distributed tracing systems. One is a stand-alone version for small and medium-sized Internet companies, which can support business systems (such as payment systems) with about 20 million PV (2000w). The other is a distributed version that supports billions of PV. So far, the stand-alone version, Fiery, has just been open-sourced (https://github.com/weiboad/fiery). It is aimed at small and medium-sized enterprises: the whole project is a single Jar package that works out of the box on any machine with a Java 8 runtime, although the system being traced still needs some simple instrumentation. The C++ distributed version has many dependencies and demands more from operations staff, so it will be released later depending on the situation. Because the code is completely open source, it can also be used with confidence in core trading systems that hold sensitive data.

There are several approaches to distributed tracing on the market. Some are used only inside the companies that built them, and some are services that are free at small scale and paid at large scale. A typical tracing system records the performance of each code block statistically. Our approach is not quite the same: through repeated experiments we simplified a great deal and kept only the features we consider truly practical. We designed the system for distributed monitoring of key systems such as payment and transaction systems.

We record the details of every request, including its return value and performance data. From the analysis tables we can quickly see how dependent interfaces perform (including third-party or uninstrumented interfaces), and instrumented interfaces get their own performance ranking. By reading the tables we can quickly find the slowest interface, replay the request, and analyze why it was slow in production. In practice, we have found that a slow PHP interface is very often caused by the slow data resources it depends on, so the focus is on dependent resources. Users can add other information as needed, which cuts out a large number of useless logs and keeps the system lean.
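To make the ranking idea concrete, here is a minimal sketch in Python of how per-request records could be turned into a slowest-interface table. It is not Fiery's code: the record shape `(trace_id, interface, elapsed_ms)`, the interface names, and the function `rank_interfaces` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: (trace_id, interface, elapsed_ms)
records = [
    ("t1", "mysql:orders", 12.0),
    ("t1", "curl:pay-gateway", 480.0),
    ("t2", "mysql:orders", 15.0),
    ("t2", "curl:pay-gateway", 1250.0),
    ("t3", "curl:user-service", 40.0),
]


def rank_interfaces(records):
    """Return (interface, count, avg_ms, max_ms, slowest_trace), slowest average first."""
    stats = defaultdict(
        lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0, "slowest_trace": None}
    )
    for trace_id, interface, elapsed_ms in records:
        s = stats[interface]
        s["count"] += 1
        s["total_ms"] += elapsed_ms
        # Remember which request was slowest so it can be looked up and replayed.
        if s["slowest_trace"] is None or elapsed_ms > s["max_ms"]:
            s["max_ms"] = elapsed_ms
            s["slowest_trace"] = trace_id
    rows = [
        (name, s["count"], s["total_ms"] / s["count"], s["max_ms"], s["slowest_trace"])
        for name, s in stats.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranking = rank_interfaces(records)
for name, count, avg_ms, max_ms, trace_id in ranking:
    print(f"{name:20s} calls={count} avg={avg_ms:.1f}ms max={max_ms:.1f}ms slowest={trace_id}")
```

Keeping the trace ID of the slowest call next to each row is what links the table back to a single request that can be inspected in detail.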

The open-source Fiery has three main parts: a PHP instrumentation library (the tracking-point library), a lightweight log monitoring and push module, and a server. Together they provide distributed tracing for a website with fewer than 20 million PV.

The instrumentation library generates a Traceid (a UUID) at the entry point. The Traceid encodes the IP address of the entry server and the request time. Every subsequent log is tagged with this UUID, and after collection all related logs are stored under it. The library also accepts a Traceid passed in by an upstream request, and it sends and maintains the RPCID at runtime. The RPCID is a hierarchical counter that lets us reconstruct the order and nesting of the calls and show them to the developer. In addition, if an Exception occurs while PHP is running, the library captures and records it so that the server can compute deduplicated statistics. Finally, the logs are written to the server's local disk. When multiple PHP processes write to the same file at the same time, lines can occasionally end up out of order, so we now name each log file after the process ID plus the project name.
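The Traceid and RPCID described above can be sketched in a few lines. This Python example is an illustration under stated assumptions, not Fiery's actual format: the 32-hex-character layout (IPv4 address, millisecond timestamp, random suffix), the dotted RPCID notation such as `0.2.1`, the log file naming pattern, and every function and class name here are hypothetical.

```python
import ipaddress
import os
import secrets
import time
from typing import Optional, Tuple


def new_trace_id(entry_ip: str, now_ms: Optional[int] = None) -> str:
    """Assumed layout: 8 hex chars of IPv4 + 12 hex chars of ms timestamp + 12 random."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    ip_hex = f"{int(ipaddress.IPv4Address(entry_ip)):08x}"
    return ip_hex + f"{now_ms:012x}" + secrets.token_hex(6)


def decode_trace_id(trace_id: str) -> Tuple[str, int]:
    """Recover the entry server IP and request time (ms) from a trace ID."""
    ip = str(ipaddress.IPv4Address(int(trace_id[:8], 16)))
    return ip, int(trace_id[8:20], 16)


class RpcId:
    """Hierarchical counter: each outgoing call gets '<current>.<n>'."""

    def __init__(self, current: str = "0"):
        self.current = current   # the ID this service received (or "0" at the entry)
        self._children = 0

    def next_child(self) -> str:
        """ID to send along with the next outgoing call."""
        self._children += 1
        return f"{self.current}.{self._children}"


def rpc_sort_key(rpc_id: str) -> Tuple[int, ...]:
    """Sort key that restores call order and nesting from dotted IDs."""
    return tuple(int(part) for part in rpc_id.split("."))


def log_file_name(project: str, pid: Optional[int] = None) -> str:
    """One file per process avoids interleaved writes from concurrent processes."""
    return f"{project}.{os.getpid() if pid is None else pid}.log"


entry = RpcId()                      # entry point of the request
first = entry.next_child()           # "0.1"
second = entry.next_child()          # "0.2"
nested = RpcId(second).next_child()  # "0.2.1", issued by the downstream service

print(new_trace_id("10.0.0.5"))
print(sorted([nested, first, entry.current, second], key=rpc_sort_key))
```

Because each service only increments its own counter and prefixes it with the ID it received, no coordination between services is needed to rebuild the call tree afterwards.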

We have implemented a simple Fiery log capture and transmission tool to make things easier for operations staff. Plenty of open source projects offer similar features, but they depend on other environments, which puts a certain burden on operations. We also have an experimental PHP log capture and transmission service. Because it is still experimental, it is likely to have some defects, and users are welcome to help debug and improve it.

We have done a lot of work on Fiery's server side. It embeds Lucene and RocksDB to index and store requests respectively, and it keeps some statistics in memory. For now the statistical dimensions are fixed: we only count the responses of local interfaces, dependent interfaces, MySQL, and cURL. We also provide call-relationship replay and deduplicated error-log alert statistics. With these features, performance faults and abnormal behavior at key points in production can be found quickly. Currently there is only a stand-alone version; if needed, I may extend it into a simple distributed mode in the future.
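Deduplicated error statistics can be illustrated with a small in-memory sketch. The Python below is not how the Fiery server (a Java application) does it; the fingerprint of exception type, file, and line, along with the names `fingerprint` and `ErrorStats`, are assumptions chosen to show the idea of counting repeats instead of alerting on every occurrence.

```python
import hashlib
from collections import Counter


def fingerprint(exc_type: str, file: str, line: int) -> str:
    """Assumed dedup key: the same exception type at the same code location."""
    raw = f"{exc_type}|{file}|{line}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()[:12]


class ErrorStats:
    """Counts occurrences per fingerprint and keeps the first sample of each."""

    def __init__(self):
        self.counts = Counter()
        self.samples = {}

    def record(self, exc_type: str, file: str, line: int, message: str, trace_id: str) -> bool:
        """Return True only the first time a given error is seen."""
        key = fingerprint(exc_type, file, line)
        self.counts[key] += 1
        if key not in self.samples:
            self.samples[key] = {"type": exc_type, "message": message, "trace_id": trace_id}
            return True
        return False


stats = ErrorStats()
is_new_1 = stats.record("PDOException", "/app/Order.php", 42, "server has gone away", "t1")
is_new_2 = stats.record("PDOException", "/app/Order.php", 42, "server has gone away", "t2")
is_new_3 = stats.record("RuntimeException", "/app/Pay.php", 7, "timeout", "t3")

for key, count in stats.counts.most_common():
    print(stats.samples[key]["type"], count, "first seen in", stats.samples[key]["trace_id"])
```

Storing the trace ID of the first occurrence gives each alert a concrete request to open in the call-relationship view.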

These services are not limited to production. We have also tried a number of interesting things internally. For example, we connected the QA test environment: after a test finishes, developers can use the Traceid to look up the full call flow, parameters, return values, and performance of the request, all shown visually for easy analysis. The same works for unit tests before going live. A while ago, when I promoted Fiery on Weibo, someone suggested it could be used with a honeypot to inspect the process and details of an intrusion. Further features still need to be explored and improved by everyone. This system is meant to fill a gap in the PHP ecosystem.

