
Million-level PHP website architecture toolbox

WBOY
2016-07-21 14:51:57

After looking at the backend technology of Facebook, the world's largest PHP site, today we look at the architecture of a PHP site with millions of users: Poppen.de. Poppen.de is a social networking site in Germany. It is small compared to Facebook and Flickr, but it has a sound architecture that integrates many technologies, including Nginx, MySQL, CouchDB, Erlang, Memcached, RabbitMQ, PHP, Graphite, Red5 and Tsung.

Poppen.de currently has 2 million registered users, 20,000 concurrent users, 200,000 private messages per day, and 250,000 logins per day. The project team has 11 developers, two designers, and two system administrators. The site runs on a freemium model: users can search for other users, send messages to friends, and upload pictures and videos.

Users who want to send unlimited messages and upload unlimited pictures pay for one of several membership types, depending on their needs. The same approach applies to video chat and other services on the site.

Nginx

All Poppen.de services sit behind Nginx. Two front-end Nginx servers handle 150,000 requests per minute at peak times. Each machine is four years old and has only one CPU and 3GB of RAM. In addition, three independent Nginx image servers serve *.bilder.poppen.de at 80,000 requests per minute.

A neat part of the Nginx setup is that many requests are answered straight from Memcached, without ever reaching a PHP machine. User profiles, for example, are among the most processing-intensive pages on the site. Once a profile page is cached in Memcached, requests for it are served directly from the cache. Poppen.de's Memcached can handle 8,000 requests per minute.
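
The cache-first request flow described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the site's actual Nginx or PHP code: the dictionary stands in for Memcached, and `render_profile`, `backend_handle`, `frontend_handle` and the `/profile/<id>` path format are all hypothetical names chosen for the example.

```python
# Stand-in for the Memcached cluster; a real setup would use a Memcached client.
cache = {}


def render_profile(user_id):
    """Hypothetical expensive page render done by the PHP tier."""
    return f"<html>profile of user {user_id}</html>"


def backend_handle(path):
    """PHP-tier stand-in: render the page and store it in the cache."""
    user_id = int(path.rsplit("/", 1)[1])
    html = render_profile(user_id)
    cache[path] = html
    return html


def frontend_handle(path):
    """Front-end stand-in: answer from the cache, fall back to the backend on a miss."""
    cached = cache.get(path)
    if cached is not None:
        return cached, "cache"
    return backend_handle(path), "backend"
```

In this sketch the first request for a path reaches the backend and every later request for the same path is answered from the cache, which is the effect the article attributes to the Nginx and Memcached combination.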

The three Nginx image servers keep a local image cache, while users upload images to a central file server. When an image is requested from one of the three servers and does not exist there locally, it is downloaded from the central file server, cached, and served. This load-balanced, distributed image tier reduces the load on the primary storage device.
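
A minimal sketch of this pull-through image cache, assuming both the local cache and the central file server can be modeled as directories. The function name `serve_image` and its parameters are hypothetical; the real site does this inside Nginx rather than in application code.

```python
import shutil
from pathlib import Path


def serve_image(name, local_cache, central_store):
    """Return the locally cached image path, fetching from the central store on a miss."""
    local_path = Path(local_cache) / name
    if not local_path.exists():
        origin = Path(central_store) / name
        if not origin.exists():
            raise FileNotFoundError(name)
        # Cache miss: copy the image from central storage into the local cache.
        local_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(origin, local_path)
    return local_path
```

After the first request, the image is served from the local directory and the central store is no longer touched for that file.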

PHP-FPM

The site runs on PHP-FPM. There are 28 PHP machines in total, each with dual CPUs and 6GB of memory, and each running 100 PHP-FPM worker processes. They use PHP 5.3.x with APC enabled. Moving to PHP 5.3 reduced CPU and memory usage by more than 30%.

The code is built on the Symfony 1.2 framework. This has several advantages: external resources can be used, development moves faster, and new developers can join the team more easily because the framework is well known. Nothing is perfect, but Symfony brings many benefits and lets the team focus on Poppen.de's business features.

Performance tuning relies on XHProf, a profiling library open sourced by Facebook. The site's framework is easy to customize and configure, and most expensive server-side calculations are cached.

MySQL

MySQL is the site's main RDBMS. There are several MySQL servers. A 4-CPU, 32GB server stores user-related data such as basic information and photo descriptions. This machine has been in use for 4 years, and the next step is to replace it with a sharded cluster. The design is still being worked out, with the aim of keeping changes to the data access code small. Data partitioning is based on user ID, because most of the information on the site is user-centered: photos, videos, messages, and so on.
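
Partitioning by user ID can be illustrated with a simple modulo rule. This is only a sketch of the general idea: the article does not say which partitioning function Poppen.de uses, and the shard names and the `shard_for` helper are hypothetical.

```python
# Hypothetical shard names; the article does not list the real servers.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for(user_id, shards=SHARDS):
    """Pick the shard holding all data (photos, videos, messages) for one user."""
    return shards[user_id % len(shards)]
```

Because every user-centered record carries a user ID, the data access layer only needs this one lookup to decide which server to query, which keeps the rest of the code unchanged.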

Three servers in a master-slave-slave configuration serve the user forums. A slave server stores the site's custom messages, of which there are currently 250 million. Another four machines run in a master-slave configuration. In addition, four machines form an NDB cluster dedicated to write-intensive data such as user visit statistics.

The table design avoids joins wherever possible and caches as much data as possible. As a result, database normalization has been abandoned entirely. To make searching easier, the database includes dedicated data mining tables. Most tables are MyISAM, which offers fast lookups. The problem now is that more and more tables suffer from full table locks, so Poppen.de is considering a move to the XtraDB storage engine.

Memcached

Memcached is used heavily in the architecture: the cluster has 51 nodes and more than 45GB of cache. It holds sessions, view caches, function execution caches, and more. A system in the architecture automatically updates the cache when records are modified. Possible ways to improve cache updates in the future include the new Redis Hash API or MongoDB.
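
Updating the cache whenever a record changes can be sketched as a write-through store. The article does not describe how Poppen.de's system is implemented, so the `UserStore` class, its method names, and the `user:<id>` key format are assumptions made for illustration; plain dictionaries stand in for the database and for Memcached.

```python
class UserStore:
    """Write-through sketch: every save updates the database and the cache together."""

    def __init__(self):
        self.db = {}      # stand-in for a database table
        self.cache = {}   # stand-in for Memcached

    def save(self, user_id, record):
        self.db[user_id] = dict(record)
        # Refresh the cached copy right away so readers never see stale data.
        self.cache[f"user:{user_id}"] = dict(record)

    def load(self, user_id):
        key = f"user:{user_id}"
        if key in self.cache:
            return self.cache[key]
        record = self.db[user_id]
        self.cache[key] = dict(record)
        return record
```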

RabbitMQ

RabbitMQ was introduced into the architecture in mid-2009. It is a good messaging solution that was easy to deploy and integrate, and two RabbitMQ servers run behind LVS. Over the last month more work has been moved onto the queue, so the 28 PHP servers now account for about 500,000 queued requests per day. Logs, email notifications, system messages, image uploads, and more are sent to the queue.

Messages are sent to the queue asynchronously using the fastcgi_finish_request() function in PHP-FPM. The function is called once the HTML or JSON response is ready to be sent to the user. It flushes the response and finishes the request, so the user does not have to wait while the PHP script publishes to the queue and cleans up.

This setup also improves resource management. For example, at peak times the site handles 1,000 login requests per minute, which would mean 1,000 concurrent updates to the user table to save each user's login time. With the queue, these queries can be run one after another instead. If more processing speed is needed, more queue consumers can be added, and even more servers can be added to the cluster, without modifying any configuration or deploying new nodes.
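
The effect of queuing the login-time updates can be sketched with an in-process queue and a single consumer. This is an illustration only: Python's `queue.Queue` stands in for RabbitMQ, the `last_login` dictionary stands in for the user table, and `record_login`, `worker` and `drain` are hypothetical names.

```python
import queue
import threading

last_login = {}           # stand-in for the user table
updates = queue.Queue()   # stand-in for the message queue


def record_login(user_id, timestamp):
    """Called by the request handler: enqueue the update and return immediately."""
    updates.put((user_id, timestamp))


def worker():
    """Consume queued updates one at a time, so the writes never run concurrently."""
    while True:
        item = updates.get()
        if item is None:  # sentinel: stop the worker
            break
        user_id, timestamp = item
        last_login[user_id] = timestamp


def drain():
    """Run one consumer until everything queued so far has been applied."""
    consumer = threading.Thread(target=worker)
    consumer.start()
    updates.put(None)
    consumer.join()
```

The request handlers only pay the cost of an enqueue, and the write load on the table is set by the number of consumers rather than by the number of simultaneous logins.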

CouchDB

CouchDB runs on a single machine for log storage. Logs can be queried and grouped there by module/action, by error type, and so on, which is very useful for locating problems. Before CouchDB was used as a log aggregation service, the team had to log in to the PHP servers one by one to analyze logs and track down a problem, which was very tedious. Now all logs flow through the queue into CouchDB, so problems can be inspected and analyzed in one place.

Graphite

The site uses Graphite to collect real-time information and statistics, from requests per module/action to Memcached hits and misses, RabbitMQ status, Unix load, and more. The Graphite service averages 4,800 update operations per minute. It has proven very useful for monitoring what is happening on the site, and its simple text protocol and graphing functions make it easy to plug into any system that needs monitoring.

One neat use of Graphite is monitoring two versions of the site at the same time. A new version of the Symfony framework was deployed in January, with the previous code kept as a backup. Because the new version could introduce performance problems, Graphite was used to compare the two versions live.

The Unix load turned out to be higher on the new version, so XHProf was used to profile both versions and find the cause.
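
Graphite's plaintext protocol is one line per data point in the form `metric.path value timestamp`, conventionally sent to the Carbon listener on TCP port 2003. The sketch below shows how small a client can be; the metric names, the host, and the helper names `format_metric` and `send_metrics` are assumptions for illustration, not taken from Poppen.de's setup.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: '<path> <value> <unix timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="127.0.0.1", port=2003):
    """Send already formatted lines to a Carbon plaintext listener (host is assumed)."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall("".join(lines).encode("ascii"))
```

A caller would build lines such as `format_metric("site.memcached.hits", 42)` and pass a batch of them to `send_metrics`.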

Red5

The site also offers two kinds of video service: videos uploaded by users, and video chat, where users interact and share video. As of mid-2009, these services were delivering 17TB of traffic per month.

Tsung

Tsung is a distributed benchmarking tool written in Erlang. At Poppen.de it is mainly used for HTTP benchmarks and for comparing MySQL with other storage systems (XtraDB). Traffic to the main MySQL server was recorded and converted into Tsung benchmark sessions. That traffic is then replayed, with Tsung generating thousands of concurrent users against the lab servers, which brings the test environment very close to real conditions.
