Optimization experience of a production accident

PHPz

Mar 12, 2017 pm 04:24 PM

After a routine promotional event, customer service began passing on a steady stream of complaints. Users reported that they could not open the web page or the app when bids opened for purchase, and that by the time the page loaded the bids had already been snapped up. At first we did not take it very seriously. I thought that this is simply what competing for bids is like, just like trying to grab a Xiaomi phone in a flash sale. As the event went on, more users protested strongly. Users who had received interest-rate coupons or cash coupons could not grab a bid, and they concluded that the platform was cheating by deliberately preventing them from using the coupons in order to save money.

Analysis process

In fact, there had been a steady flow of similar feedback in the past, and customer service had been placating users with the Xiaomi flash-sale example. This time the feedback was too strong to ignore, so we took it seriously. We have three front-end products: the app, the official website, and H5. The app is used the most and the official website second. H5 sees little day-to-day use, but its traffic rises sharply during events (events are mostly H5 games, and H5 is convenient for promotion and marketing). Each of the three front-end products is load-balanced by LVS onto two back-end web service servers (as shown below). This time the feedback came mostly from the website and the app, so we focused on those four servers.

[Figure: the three front-end products load-balanced by LVS onto the back-end web service servers]

First, I suspected that the network bandwidth was saturated and asked a network engineer to monitor it with tools. During bidding the peak bandwidth usage was only about 70%, so we ruled that out. Next I suspected that the web servers could no longer cope. Checking the load on the two official website servers with the top command showed that it soared to about 6-8 at the moment bidding opened and then gradually fell back to normal. The two app servers peaked at 10-12 and then returned to normal.
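Whether a load of 6-8 or 10-12 is alarming depends on the number of CPU cores, which the article does not state. The following is a minimal sketch that normalizes a load average by core count. The 4-core figure in the usage lines and the threshold of 1.0 per core are assumptions for illustration (a common rule of thumb), not values from the incident.

```python
def load_per_core(load_average: float, cpu_cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be positive")
    return load_average / cpu_cores


def is_overloaded(load_average: float, cpu_cores: int, threshold: float = 1.0) -> bool:
    """Treat more than `threshold` runnable tasks per core as overload.

    The default threshold is a rule of thumb, not a figure from the article.
    """
    return load_per_core(load_average, cpu_cores) > threshold


# Peak loads reported in the article, on a hypothetical 4-core server.
print(load_per_core(8, 4))    # 2.0
print(load_per_core(12, 4))   # 3.0
print(is_overloaded(12, 4))   # True
```

On a real server the inputs could come from `os.getloadavg()` and `os.cpu_count()` on Unix-like systems.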

Tracing the business logs on the web servers, we found that the database update layer was reporting that no new database connection could be obtained, or that the database connections had been used up. We assumed that the database's maximum number of connections was too low, so we raised the MySQL maximum number of connections to three times its previous value. At the next bidding round the business logs no longer showed errors related to database connections, but many users still reported that pages would not open during bidding.
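Tripling the limit is a quick fix. A rough way to size it is to work back from what the web tier can open at once. The sketch below assumes each Apache worker holds at most one database connection and reserves a few connections for administration. Both are assumptions, since the article does not describe how the application manages connections. The worker count of 1024 is the Apache limit the article reports for its web servers.

```python
def required_db_connections(web_servers: int, workers_per_server: int,
                            connections_per_worker: int = 1,
                            reserved: int = 10) -> int:
    """Worst-case number of simultaneous database connections.

    `reserved` leaves headroom for admin and replication sessions;
    the value 10 is an illustrative assumption.
    """
    return web_servers * workers_per_server * connections_per_worker + reserved


# Four web servers (two for the website, two for the app),
# each allowing up to 1024 Apache workers.
print(required_db_connections(web_servers=4, workers_per_server=1024))  # 4106
```

If the result is far above what the database can serve, raising the limit alone only moves the bottleneck, which matches what happened next in the incident.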

Continuing with the web servers, we ran the command (ps -ef|grep httpd|wc -l) during bidding to check the number of httpd connections, which was about 1,000. We then checked the Apache configuration file, where the maximum number of connections was set to 1024 (Apache's default maximum is 256). So during bidding the number of connections had already reached the maximum, and many users could not obtain an HTTP connection, which left the page unresponsive or the app waiting indefinitely. We therefore raised the maximum number of connections in the Apache configuration file to 1024*3.
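Strictly speaking, that command counts httpd processes, and it usually counts the grep process itself as well. With a process-per-request Apache setup (assumed here; the article does not name the MPM) the process count roughly tracks the number of connections being served. A small sketch that does the same count while skipping the grep line follows; the sample ps output is made up for illustration.

```python
import subprocess


def count_httpd_processes(ps_output: str) -> int:
    """Count lines of `ps -ef` output whose command is httpd, ignoring grep."""
    count = 0
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # the 8th column of `ps -ef` is CMD
        if len(fields) < 8:
            continue
        command = fields[7]
        if "httpd" in command and not command.startswith("grep"):
            count += 1
    return count


def current_httpd_count() -> int:
    """Run `ps -ef` on the local machine and count httpd processes."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return count_httpd_processes(result.stdout)


# Made-up sample output for illustration.
SAMPLE = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root      1201     1  0 09:00 ?        00:00:01 /usr/sbin/httpd -DFOREGROUND
apache    1210  1201  0 09:00 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
apache    1211  1201  0 09:00 ?        00:00:00 /usr/sbin/httpd -DFOREGROUND
ops       2044  1999  0 09:05 pts/0    00:00:00 grep httpd
"""
print(count_httpd_processes(SAMPLE))  # 3
```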

Continuing to observe, the number of Apache connections could still soar to 2,600-2,800 during bidding. According to customer service, many users were still reporting bidding problems, although things were slightly better than before. There were also sporadic reports from users who had already grabbed a bid, only to have it rolled back afterwards. We then turned to the database servers, using the top command and MySQL Workbench to look at the load on the MySQL main and slave databases. I was shocked (as shown below): every indicator on the main database had hit its peak, while the slave database was under almost no pressure.

[Figure: load on the MySQL main and slave databases]

Tracing the code, we found that all the business code in the three front ends connected to the main database, and only the back-office query workload used the slave database, so we started refactoring immediately. Except for queries in the bidding flow, all queries on other pages or in other business flows were moved to the slave database. After the change, the pressure on the main database dropped significantly and the pressure on the slave database began to rise, as shown below:

[Figure: load on the main and slave databases after the change]
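The read/write split described above can be sketched as a small router. This is an illustration, not the platform's code: the class, the database names, and the `in_bidding_flow` flag are hypothetical. Keeping bidding-flow reads on the main database follows the article; the reason given in the comment (avoiding stale data caused by replication lag) is an assumption.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Iterator, List


@dataclass
class QueryRouter:
    """Route SQL statements to the main database or a slave.

    Database names are hypothetical labels; a real router would hand
    back connection objects instead of strings.
    """
    primary: str
    replicas: List[str]
    _next_replica: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Fall back to the main database if no slave is configured.
        self._next_replica = cycle(self.replicas or [self.primary])

    def route(self, sql: str, in_bidding_flow: bool = False) -> str:
        """Return the name of the database that should run `sql`."""
        is_read = sql.lstrip().lower().startswith("select")
        # Writes, and reads inside the bidding flow, stay on the main
        # database (assumed reason: replicated data may lag behind).
        if not is_read or in_bidding_flow:
            return self.primary
        return next(self._next_replica)


router = QueryRouter(primary="main-db", replicas=["slave-db-1"])
print(router.route("SELECT * FROM bids WHERE status = 'open'"))      # slave-db-1
print(router.route("SELECT remaining FROM bids WHERE id = 7", True))  # main-db
print(router.route("UPDATE bids SET remaining = 0 WHERE id = 7"))    # main-db
```

With several slaves in `replicas`, the same router spreads reads across them in round-robin order.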

According to customer service, after the change the problem of bids being rolled back almost disappeared. The problem of pages failing to open or opening slowly during bidding was relieved to some extent, but some users still reported it. From the analysis above we can conclude:

1 The two load-balanced servers behind each product have reached their processing limit; more servers need to be added to share the load.
2 The pressure on the MySQL main database has dropped significantly, but the pressure on the slave database has increased. The current one-master, one-slave setup needs to become one master with multiple slaves.
3 To solve these problems completely, the platform needs to be optimized as a whole, for example: business optimization (removing hot spots in the business logic), more caching, static pages (the front-end optimization rules from Yahoo and Google apply here, and many testing websites on the Internet can evaluate a page), and so on.

I wrote an optimization report based on these circumstances, see below:

Optimization Report

1 Background

As the company's business has grown, business volume and user numbers have surged. The official website's PV has risen from the initial xxx-xxx to the current xxx-xxxx, and the number of active app users has increased significantly, which poses greater challenges to the platform's current technical architecture. In particular, bids have been in short supply recently, so the time it takes to fill a bid keeps getting shorter and the pressure on the servers keeps growing. The current system architecture therefore needs to be upgraded to support more users and a larger business volume.

2 User access diagram

[Figure: user access diagram]

Currently the platform has three user-facing products: the official website, the app, and the small web page (H5). Of these, the official website and the app are under relatively high pressure.

3 Existing problems

The problems users encounter when competing for bids fall into the following areas:
1. The web page or app cannot be opened.
2. The website or app is slow to open.
3. A transfer succeeds during bidding, but the subsequent update fails because the server is under heavy pressure, so the money is refunded.
4. The database connections are exhausted, so investment records cannot be added after the bid is full and the bid's progress is rolled back.

4 Analysis

Through in-depth analysis of recent server parameters, concurrency, and system logs, it is concluded that:
1. The server pressure is huge during bidding on the official website and the app, and the app's problem is more prominent. At the peak of bidding, a single app server has reached nearly 2,600 Apache connections, which is close to Apache's maximum processing capacity.
2. The database server is under huge pressure. The pressure on the database is concentrated in two periods:

1) When the platform runs an event, visits to the official website, small web page, and app increase dramatically, which greatly increases the volume of data queries. Once the database reaches its processing limit, problems such as the website opening slowly appear.

2) When users compete for bids. The pressure falls into two stages: before bidding and during bidding. Before bidding, because bids fill very quickly, users open the bidding page in advance and refresh it continuously, which increases the query pressure on the database. If very many users are competing, the database connections are used up before bidding even starts. During bidding, a single purchase involves changes and queries on about 15 tables. Each bid is worth 10 million, and it takes about 100-200 purchases to fill it. Taking the median of 150 people, the data must be updated 2,000-3,000 times within a few seconds (updates only, excluding queries). This produces heavy concurrency, which can cause update failures or connection timeouts and so affects both users' bidding and the normal filling of bids.

5 Solution
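Before turning to the individual fixes, the update volume estimated in the analysis can be checked with a short calculation. The sketch assumes, as an upper bound, that every purchase writes to each of the roughly 15 tables once. The report says the 15 tables cover both changes and queries, so real update counts would be somewhat lower.

```python
TABLES_PER_PURCHASE = 15      # "about 15 tables" per purchase
BUYERS_RANGE = (100, 200)     # purchases needed to fill one bid


def updates_per_bid(buyers: int, tables_per_purchase: int = TABLES_PER_PURCHASE) -> int:
    """Upper-bound estimate: every purchase touches every table once."""
    return buyers * tables_per_purchase


low, high = (updates_per_bid(b) for b in BUYERS_RANGE)
median = updates_per_bid(sum(BUYERS_RANGE) // 2)
print(low, median, high)  # 1500 2250 3000
```

The median of 2,250 falls inside the 2,000-3,000 range quoted in the report.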

1. Web server solution

[Figure: schematic diagram of a single user accessing the web service]

Currently the website and the app each use two servers for load balancing, with Apache installed on each server for server-side processing. Each Apache instance can handle at most about 2,000 connections, so in theory the website or the app can currently handle about 4,000 simultaneous user requests. Supporting 10,000 simultaneous requests would take 5 Apache servers for each product, so between the website and the app we are currently 6 web servers short.

[Figure: access diagram after upgrading the servers]
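The web server sizing above can be sketched as a short calculation. The per-server capacity and the target come from the report. Reading the shortfall of 6 as 3 extra servers for each of the website and the app is an interpretation of the report's numbers, not something it states explicitly.

```python
import math


def servers_needed(target_connections: int, per_server_capacity: int) -> int:
    """Smallest number of servers whose combined capacity covers the target."""
    return math.ceil(target_connections / per_server_capacity)


def shortfall(target_connections: int, per_server_capacity: int, current_servers: int) -> int:
    """Additional servers required beyond those already deployed."""
    return max(0, servers_needed(target_connections, per_server_capacity) - current_servers)


PER_SERVER = 2000      # connections one Apache instance handles (from the report)
TARGET = 10_000        # simultaneous requests to support (from the report)

per_product = shortfall(TARGET, PER_SERVER, current_servers=2)
print(servers_needed(TARGET, PER_SERVER))  # 5
print(per_product)                         # 3
print(per_product * 2)                     # 6 across the website and the app
```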

2. Database solution

[Figure: current database deployment]

1) Master-slave separation takes about 80% of the query pressure off the main database. At present the platform's official website and app both connect to the MySQL main database, which multiplies the pressure on it. Migrating all queries in these services to the slave database can greatly reduce the pressure on the main database.

2) Add cache servers. When queries on the slave database peak, master-slave synchronization is also affected, which in turn affects transactions. Queries that users run frequently should therefore be cached to reduce the request pressure on the database. Three cache servers need to be added to build a redis cluster.
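The caching idea can be sketched with the cache-aside pattern: look in the cache first and query the database only on a miss. In this sketch a plain dictionary stands in for redis and `load_bid_list` stands in for a slave-database query. Both names, and the 5-second time-to-live, are hypothetical choices for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Cache-aside lookup with a time-to-live, backed by a plain dict.

    The dict stands in for redis; `loader` stands in for a database query.
    """

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit: no database query
        value = self._loader(key)                # cache miss: query the database
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying data changes."""
        self._store.pop(key, None)


db_queries = []


def load_bid_list(key: str) -> str:
    db_queries.append(key)                       # record each simulated database hit
    return f"rows for {key}"


cache = CacheAside(load_bid_list, ttl_seconds=5.0)
for _ in range(1000):                            # 1000 page refreshes
    cache.get("open-bids")
print(len(db_queries))  # 1
```

A short time-to-live suits pages that users refresh continuously before bidding opens: the data is at most a few seconds old, and the database sees one query per interval instead of one per refresh.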

3. Other optimizations
1) Make the home page of the official website static. According to cnzz statistics, the home page accounts for about 15% of the website's total visits. Serving the data on the home page that changes infrequently as static content will make the official website open more smoothly.

2) Tune the Apache servers: enable gzip compression, configure a reasonable number of connections, and so on.

3) Remove the update hot spot in the investment flow: the bid progress table. Every time an investment succeeds or fails, the bid progress has to be updated, and concurrent updates from multiple threads can run into problems such as optimistic-lock conflicts. Drop the updates during the process and save the progress information to the bid progress table only after the bid is full, which relieves the pressure on the database during investment.
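The hot spot removal in item 3) can be sketched as follows: investments are appended as separate records, progress is derived from them when needed, and the progress row is written once when the bid is full. The class and field names are hypothetical, and the sketch is single-threaded. A real implementation would still need an atomic guard in the database against overselling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bid:
    """In-memory stand-in for one bid and its tables (names are hypothetical)."""
    total_amount: int
    investments: List[int] = field(default_factory=list)  # append-only investment records
    progress_row_writes: int = 0                           # writes to the hot progress row
    final_progress: Optional[int] = None                   # stored once, when the bid is full

    def invested(self) -> int:
        return sum(self.investments)

    def invest(self, amount: int) -> bool:
        """Record an investment without touching the progress row."""
        if amount <= 0 or self.invested() + amount > self.total_amount:
            return False                                    # rejected: would overfill the bid
        self.investments.append(amount)
        if self.invested() == self.total_amount:
            # Single write of the progress information after the bid is full.
            self.final_progress = 100
            self.progress_row_writes += 1
        return True

    def progress_percent(self) -> int:
        """Derive progress on demand from the investment records."""
        return self.invested() * 100 // self.total_amount


bid = Bid(total_amount=10_000_000)
for _ in range(100):                 # 100 purchases of 100,000 each fill the bid
    bid.invest(100_000)
print(bid.progress_percent())        # 100
print(bid.progress_row_writes)       # 1
```

Compared with updating the progress row on every purchase, the contended row is written once per bid instead of once per investment.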

6 Server upgrade plan

1. The biggest pressure on the platform comes from the database. The current one master and one slave needs to become one master and four slaves. The large number of queries generated by the official website, app, and small web page are distributed across three slave databases through a virtual IP, and back-office management queries go to the fourth slave. Three new database servers are needed.
[Figure: schematic diagram after the database upgrade]

2. Add caching to reduce the pressure on the database. Two new cache servers with large memory are needed.
[Figure: deployment after adding the cache servers]

3. Three new web servers need to be added to spread user access requests.

The app needs two new servers.
The app servers are under the greatest pressure during bidding, so two new servers need to be added. Schematic diagram after the configuration is complete:
[Figure: app servers after the upgrade]

The official website needs one new server.
The official website is also under some pressure during bidding and needs one new server. Diagram after completion:
[Figure: official website servers after the upgrade]

In total, 8 servers need to be purchased, two of which require large memory (64 GB or more).
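The total can be checked against the three items of the upgrade plan with a short tally. The dictionary keys are descriptive labels chosen for this sketch.

```python
NEW_SERVERS = {
    "database slaves": 3,   # one master + one slave -> one master + four slaves
    "cache": 2,             # large-memory machines
    "app web": 2,
    "website web": 1,
}

total = sum(NEW_SERVERS.values())
print(total)  # 8
```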


Note: After all the optimization plans were put into production, the problems were solved and users could compete for bids without worry!

Author: Pure Smile
Source: http://www.php.cn/
Copyright belongs to the author, please indicate the source when reprinting.
