
Interviewer: How much do you know about high concurrency? Me: emmm...

Java学习指南 · 2023-07-26


High concurrency is something almost every programmer wants to experience, for a simple reason: as traffic grows, you run into all kinds of technical problems, such as interface timeouts, rising CPU load, frequent GC, deadlocks, and large-volume data storage. These problems push us to keep deepening our technical skills.

In past interviews, when a candidate had worked on high-concurrency projects, I usually asked them to talk about their understanding of high concurrency, but few could answer the question systematically. The weak answers roughly fall into the following categories:

1. No concept of data-driven indicators: not knowing which indicators to use to measure a high-concurrency system, unable to tell the difference between concurrency and QPS, and not even knowing key figures such as the system's total users, active users, and QPS/TPS at average and peak times.

2. Designed some solutions, but without a firm grasp of the details: unable to say which technical points a solution should focus on or what side effects it may bring. For example, when read performance hits a bottleneck, a cache is introduced, but issues such as cache hit rate, hot keys, and data consistency are ignored.

3. A one-sided understanding that equates high-concurrency design with performance optimization: talking about concurrent programming, multi-level caching, asynchronization, and horizontal scaling, while ignoring highly available design, service governance, and operations assurance.

4. Mastering the big picture while ignoring the basics: able to explain big ideas such as vertical layering, horizontal partitioning, and caching, yet never bothering to analyze whether a data structure is reasonable or an algorithm efficient, and never thinking about optimizing details from the two most fundamental dimensions: IO and computation.

In this article, I want to combine my own experience with high-concurrency projects to systematically summarize the knowledge and practical ideas that high concurrency requires, and I hope it helps you. The content is divided into the following 3 parts:

  • How to understand high concurrency?
  • What is the goal of high-concurrency system design?
  • What are the practical solutions for high concurrency?

01 How to understand high concurrency?

High concurrency means large traffic, and technical means are needed to withstand its impact. These means are like operating on the traffic, letting the system process it more smoothly and giving users a better experience.

Common high-concurrency scenarios include Taobao's Double 11, ticket grabbing during the Spring Festival travel rush, and trending posts from Weibo celebrities. Beyond these typical cases, flash-sale systems handling hundreds of thousands of requests per second, order systems processing tens of millions of orders per day, and information-feed systems with hundreds of millions of daily active users all qualify as high concurrency.

Obviously, in the high concurrency scenarios mentioned above, the amount of concurrency varies. So how much concurrency is considered high concurrency?

1. Don't just look at the numbers; look at the specific business scenario. You cannot say that a 100,000-QPS flash sale is high concurrency while a 10,000-QPS information feed is not. The feed scenario involves complex recommendation models and various manual strategies, and its business logic may be more than 10 times as complex as the flash-sale scenario. They are not in the same dimension, so the comparison is meaningless.

2. A business grows from 0 to 1. Concurrency and QPS are only reference indicators. What matters most is: as business volume gradually grows 10x or 100x, have you used high-concurrency techniques to evolve your system, preventing and solving the problems high concurrency brings at the levels of architecture design, coding, and even product decisions, rather than blindly upgrading hardware and adding machines?

In addition, the business characteristics of each high-concurrency scenario are completely different: there are read-heavy, write-light information-feed scenarios and read-and-write-heavy transaction scenarios. Is there a general technical solution that addresses high concurrency across such different scenarios?

We can certainly learn from the big ideas and from other people's solutions, but during actual implementation there will be countless pitfalls in the details. Moreover, because software and hardware environments, technology stacks, and product logic are never fully identical, even the same business scenario using the same technical solution will face different problems, and these pitfalls must be overcome one by one.

Therefore, in this article I will focus on the basic knowledge, general ideas, and hard-won experience from my own practice, hoping to give you a deeper understanding of high concurrency.

02 What is the goal of high-concurrency system design?

Only by first clarifying the goals of high-concurrency system design can the discussion of design schemes and practical experience that follows be meaningful and targeted.

2.1 Macro Goals

High concurrency does not mean pursuing only high performance; that is a one-sided understanding many people hold. From a macro perspective, high-concurrency system design has three goals: high performance, high availability, and high scalability.

1. High performance: performance reflects the system's parallel processing capability. With limited hardware investment, improving performance means saving cost. Performance also shapes user experience: response times of 100 milliseconds and 1 second give users completely different feelings.

2. High availability: indicates how long the system can serve normally. One system runs all year without downtime or faults; another suffers online incidents and outages every now and then; users will certainly choose the former. Moreover, if a system is only 90% available, it will greatly drag down the business.

3. High scalability: indicates the system's ability to scale out, i.e., whether it can complete capacity expansion within a short time during traffic peaks and absorb spikes more smoothly, such as Double 11 events or celebrity-divorce trending topics.

[Diagram: the three goals of high-concurrency design: high performance, high availability, and high scalability]

These three goals need to be considered comprehensively, because they are interrelated and even affect each other.

For example, when considering the scalability of the system, you design services to be stateless. This cluster design guarantees high scalability, but it also indirectly improves the system's performance and availability.

Another example: to ensure availability, timeouts are usually configured on service interfaces to prevent large numbers of threads from blocking on slow requests and triggering a system avalanche. So what is a reasonable timeout? Generally, it is set based on the measured performance of the dependent service.
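As a minimal sketch of that idea, the snippet below sets per-call timeouts with the JDK 11+ HttpClient. The URL and the 100/200 ms budgets are illustrative assumptions, not values from the article; in practice the budget would be derived from the dependency's measured TP99.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class TimeoutExample {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofMillis(100))   // fail fast on connect
                .build();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/api/orders")) // placeholder URL
                .timeout(Duration.ofMillis(200))          // overall per-request budget
                .GET()
                .build();

        // Throws java.net.http.HttpTimeoutException if the budget is exceeded,
        // so the calling thread is released instead of blocking indefinitely.
        HttpResponse<String> resp =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode());
    }
}
```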

2.2 Micro Goals

Now let's look at it from a micro perspective: what specific indicators measure high performance, high availability, and high scalability, and why were these indicators chosen?

❇ Performance Indicators

Performance indicators measure existing performance problems and serve as the baseline for evaluating optimizations. Generally, interface response time over a period of time is used as the indicator.

1. Average response time: the most commonly used, but it has an obvious flaw: it is insensitive to slow requests. For example, with 10,000 requests of which 9,900 take 1 ms and 100 take 100 ms, the average is 1.99 ms. The average has risen by only 0.99 ms even though the response time of 1% of requests has increased a hundredfold.

2. TP90, TP99, and other percentile values: sort the response times from smallest to largest; TP90 is the response time at the 90th percentile. The higher the percentile, the more sensitive the indicator is to slow requests.

[Diagram: response-time percentiles such as TP90 and TP99]

3. Throughput: inversely related to response time; for example, with a response time of 1 ms, throughput is 1,000 requests per second.

Usually, when setting performance goals, both throughput and response time are considered, for example: at 10,000 requests per second, keep AVG under 50 ms and TP99 under 100 ms. For a high-concurrency system, AVG and TP percentiles must be considered together.

In addition, from the perspective of user experience, 200 milliseconds is the first threshold: below it, users do not perceive any delay. One second is the second threshold: users perceive the delay but can accept it.

Therefore, for a healthy high-concurrency system, TP99 should be kept within 200 milliseconds and TP999 (or TP9999) within 1 second.
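Here is a minimal, self-contained sketch of computing AVG and TP percentiles from a batch of recorded latencies using the nearest-rank method. The class and method names are my own, and the sample data reproduces the 9,900 × 1 ms + 100 × 100 ms example above.

```java
import java.util.Arrays;

public class LatencyStats {

    // Value at the given percentile (e.g. 0.99 for TP99), nearest-rank method.
    static long percentile(long[] samplesMs, double p) {
        long[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length) - 1;
        return sorted[Math.max(rank, 0)];
    }

    public static void main(String[] args) {
        // 9,900 requests at 1 ms and 100 slow requests at 100 ms.
        long[] samples = new long[10_000];
        Arrays.fill(samples, 0, 9_900, 1L);
        Arrays.fill(samples, 9_900, 10_000, 100L);

        double avg = Arrays.stream(samples).average().orElse(0);
        System.out.printf("AVG   = %.2f ms%n", avg);                        // 1.99 ms
        System.out.println("TP90  = " + percentile(samples, 0.90) + " ms"); // 1 ms
        System.out.println("TP99  = " + percentile(samples, 0.99) + " ms"); // 1 ms
        System.out.println("TP999 = " + percentile(samples, 0.999) + " ms"); // 100 ms
    }
}
```

Note how in this distribution even TP99 stays at 1 ms while TP999 jumps to 100 ms: exactly why higher percentiles are more sensitive to slow requests.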

❇ Availability Indicators

High availability means the system has a strong ability to run without faults. Availability = uptime / total running time, and it is generally described as a number of nines.
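As a quick back-of-the-envelope check using that formula: a year has 8,760 hours, so two nines (99%) permits about 87.6 hours of downtime per year, three nines (99.9%) about 8.76 hours, and four nines (99.99%) about 53 minutes.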

[Table: availability levels in nines and the corresponding downtime]

For high-concurrency systems, the most basic requirement is to guarantee three or four nines. The reason is simple: if you can only achieve two nines, the system is unavailable 1% of the time. Some large companies have annual GMV or revenue above 100 billion; 1% of that is a business impact on the order of a billion.

❇ Scalability Indicators

In the face of sudden traffic, you cannot rebuild the architecture on the spot; the fastest response is to add machines and increase the system's processing power linearly.

For business clusters or basic components, scalability = performance improvement ratio / machine addition ratio. Ideal scalability means performance grows in proportion to added resources. Generally, scalability should be kept above 70%.
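For example, if you add 100% more machines (doubling the cluster) but throughput improves by only 60%, scalability is 60% / 100% = 60%, below the 70% bar, often a sign of contention on some shared resource.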

But from the perspective of the overall architecture of a high-concurrency system, the goal of scalability is not merely to design services to be stateless: when traffic grows 10x, the business services can quickly scale out 10x, but the database may become the new bottleneck.

Stateful storage services like MySQL are usually hard to scale at the technical level. If the architecture is not planned in advance (vertical and horizontal splitting), scaling will involve migrating large amounts of data.

Therefore, high scalability must take into account service clusters; middleware such as databases, caches, and message queues; load balancing; bandwidth; dependent third parties; and so on. Once concurrency reaches a certain level, each of these can become a bottleneck for further scaling.



03 What are the practical solutions for high concurrency?

After clarifying the three major goals of high-concurrency design, we can systematically summarize the design schemes in two parts: first the general design methods, then concrete practical solutions organized around high performance, high availability, and high scalability.

3.1 General Design Methods

The general methods start from two dimensions, vertical and horizontal, commonly known as the two pillars of high-concurrency processing: vertical expansion and horizontal expansion.

❇ Vertical expansion (scale-up)

Its goal is to improve the processing capability of a single machine, with two main approaches:

1. Improve the hardware performance of a single machine: add memory, CPU cores, or storage capacity, or upgrade the disks to SSDs.

2. Improve the software performance of a single machine: use caches to reduce the number of IOs, and use concurrency or asynchrony to increase throughput (a minimal local-cache sketch follows).
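A minimal sketch of point 2's "use caches to reduce IOs", using only the JDK; loadFromDb is an illustrative placeholder for a real database or RPC lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LocalCacheDemo {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    static String loadFromDb(String key) {
        System.out.println("expensive IO for " + key); // simulated slow lookup
        return "value-of-" + key;
    }

    static String get(String key) {
        // computeIfAbsent caches the first result; later calls skip the IO.
        return CACHE.computeIfAbsent(key, LocalCacheDemo::loadFromDb);
    }

    public static void main(String[] args) {
        get("user:42"); // triggers the IO
        get("user:42"); // served from the local cache
    }
}
```

A production cache would also need capacity limits and expiry (e.g., a library like Caffeine), which a bare ConcurrentHashMap does not provide.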
❇ Horizontal expansion (scale-out)

Because a single machine's performance always has a limit, horizontal expansion must eventually be introduced, further improving concurrent processing capability through cluster deployment. It includes the following two directions:

1. Build a proper layered architecture: this is the groundwork for horizontal expansion. High-concurrency systems often carry complex business logic, and layering breaks a complex problem into simpler ones that are easier to scale out.

[Diagram: the most common layered architecture for Internet systems]

The diagram above shows the most common layered architecture on the Internet. Of course, a real high-concurrency architecture improves on it further: for example, separating dynamic and static content and introducing a CDN. The reverse-proxy layer can be LVS + Nginx, the web layer can be a unified API gateway, the business service layer can be further split into microservices along vertical business lines, and the storage layer can be a mix of heterogeneous databases.

2. Scale each layer horizontally: stateless services scale out directly; stateful ones need shard routing. Business clusters can usually be designed to be stateless, while databases and caches are stateful, so partition keys must be designed to shard the storage; read performance can additionally be improved through master-slave replication and read/write splitting (a minimal shard-routing sketch follows).
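As a rough illustration of shard routing by partition key (the shard count and key format are assumptions of mine, not from the article):

```java
public class ShardRouter {
    private static final int SHARD_COUNT = 8;

    // Stable, non-negative hash -> shard index; the same key always
    // routes to the same shard, which is what makes stateful scaling work.
    static int shardFor(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), SHARD_COUNT);
    }

    public static void main(String[] args) {
        String userId = "user_10086";
        System.out.println("route " + userId + " -> order_db_" + shardFor(userId));
    }
}
```

Real systems also need a resharding/migration plan, since changing SHARD_COUNT remaps existing keys.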

3.2 Specific Practical Solutions

Combining my personal experience, here are implementable practical solutions for each of the three aspects: high performance, high availability, and high scalability.

❇ High-performance practical solutions

1. Cluster deployment, reducing the pressure on any single machine through load balancing.
2. Multi-level caching, including the use of CDN, local cache, and distributed cache for static data, as well as handling hot keys, cache penetration, cache concurrency, and data consistency in caching scenarios.
3. Database sharding (splitting databases and tables) and index optimization, plus solving complex query problems with the help of search engines.
4. Considering NoSQL databases such as HBase or TiDB, provided the team is familiar with these components and has strong operations capability.
5. Asynchronization, processing secondary flows asynchronously through multi-threading, MQ, or even delayed tasks.
6. Rate limiting, which first requires deciding whether the business allows it (for example, flash-sale scenarios do), including front-end limiting, Nginx access-layer limiting, and server-side limiting.
7. Peak shaving and valley filling, absorbing traffic spikes through MQ.
8. Concurrent processing, parallelizing serial logic through multi-threading (see the sketch after this list).
9. Pre-computation, e.g., in red-envelope-grabbing scenarios the amounts can be calculated in advance and cached, then used directly when the envelopes are handed out.
10. Cache warm-up, preheating data into the local or distributed cache in advance through asynchronous tasks.
11. Reducing the number of IOs, e.g., batch reads/writes for the database and cache, batch interfaces for RPC, or eliminating RPC calls via redundant data.
12. Reducing packet size during IO, including using lightweight communication protocols, appropriate data structures, removing redundant fields from interfaces, shrinking cache-key size, and compressing cache values.
13. Program-logic optimization, such as moving the checks most likely to short-circuit execution to the front, optimizing the computation inside for loops, or choosing more efficient algorithms.
14. Applying pooling techniques and sizing the pools properly, including HTTP connection pools, thread pools (set core parameters according to whether the work is CPU-bound or IO-bound; see the sketch after this list), and database and Redis connection pools.
15. JVM optimization, including the sizes of the young and old generations and the choice of GC algorithm, to minimize GC frequency and pause time.
16. Lock selection: use optimistic locking in read-heavy, write-light scenarios, or consider reducing lock contention through lock segmentation.
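To make points 8 and 14 concrete, here is a hedged sketch: three independent lookups that would otherwise run serially execute in parallel on a bounded thread pool. The pool size and the fetch methods are illustrative placeholders, not from the original article.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelAggregation {
    // Roughly sized for IO-bound work; tune from measurements, not guesses.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8);

    static String fetchUser(long id)    { return "user-" + id; }       // stands in for an RPC
    static String fetchOrders(long id)  { return "orders-of-" + id; }  // stands in for an RPC
    static String fetchCoupons(long id) { return "coupons-of-" + id; } // stands in for an RPC

    public static void main(String[] args) throws Exception {
        long uid = 42L;
        // The three calls are independent, so run them concurrently;
        // total latency drops from sum(a, b, c) to roughly max(a, b, c).
        CompletableFuture<String> user    = CompletableFuture.supplyAsync(() -> fetchUser(uid), POOL);
        CompletableFuture<String> orders  = CompletableFuture.supplyAsync(() -> fetchOrders(uid), POOL);
        CompletableFuture<String> coupons = CompletableFuture.supplyAsync(() -> fetchCoupons(uid), POOL);

        CompletableFuture.allOf(user, orders, coupons).join();
        System.out.println(user.get() + " | " + orders.get() + " | " + coupons.get());
        POOL.shutdown();
    }
}
```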

All of the above come down to examining every possible optimization point along the two dimensions of computation and IO. You need a supporting monitoring system to understand current performance in real time and back up your bottleneck analysis, then follow the 80/20 rule and concentrate on the main bottlenecks.

❇ High-availability practical solutions

1. Failover between peer nodes: both Nginx and service-governance frameworks support automatically routing around a failed node to another one.
2. Failover for non-peer nodes: master-slave switchover via heartbeat detection (such as Redis Sentinel or Cluster mode, MySQL master-slave switchover, and so on).
3. Interface-level timeout settings, retry strategies, and idempotent design (a minimal retry sketch appears after this list).
4. Degradation: protect core services by sacrificing non-core ones, with circuit breaking when necessary; or keep a fallback link for when the core link fails.
5. Rate limiting: directly reject requests that exceed the system's processing capacity, or return an error code.
6. Message reliability guarantees in MQ scenarios, including producer-side retries, broker-side persistence, and consumer-side ack mechanisms.
7. Grayscale releases: deploy to a small slice of machines first, watch system logs and business metrics, and roll out fully once stable.
8. Monitoring and alerting: a comprehensive monitoring system covering the basics (CPU, memory, disk, network) as well as web servers, the JVM, databases, all middleware, and business metrics.
9. Disaster-recovery drills: similar to today's chaos engineering, apply destructive measures to the system and observe whether local failures cause availability problems.
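As referenced in point 3, here is a minimal sketch, under assumed names, of combining a per-attempt timeout, a capped retry, and an idempotency key so a retried request can be applied at most once by the server; the remote call itself is simulated.

```java
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RetryDemo {

    // Simulated downstream call; a real client would send the key so the
    // server can deduplicate retried requests (idempotency).
    static String callRemote(String idempotencyKey) throws Exception {
        return "ok:" + idempotencyKey;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        String key = UUID.randomUUID().toString(); // generated once, reused on retries
        int maxAttempts = 3;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Future<String> f = pool.submit(() -> callRemote(key));
            try {
                // Per-attempt timeout keeps threads from blocking forever.
                System.out.println(f.get(200, TimeUnit.MILLISECONDS));
                break;
            } catch (TimeoutException e) {
                f.cancel(true);
                if (attempt == maxAttempts) throw e; // give up, surface the error
            }
        }
        pool.shutdown();
    }
}
```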

High-availability solutions mainly proceed along three directions: redundancy, trade-offs, and system operations. They also require supporting on-call mechanisms and fault-handling processes so that online problems are followed up on in time.

❇ High-scalability practical solutions

1. A reasonable layered architecture: for example, the common Internet layered architecture mentioned above; within microservices, a finer-grained split into a data-access layer and a business-logic layer is also possible (but evaluate the performance cost: it may add a network hop).

2. Splitting the storage layer: split vertically by business dimension, then split horizontally (sharding databases and tables) by data-characteristic dimensions.
3. Splitting the business layer: most commonly by business dimension (such as product and order services in e-commerce), but also by core vs. non-core interfaces, or by request source (such as To C and To B, or APP and H5).


Final words

High concurrency is indeed a complex, systemic problem. Due to limited space, topics such as distributed tracing, full-link stress testing, and flexible transactions are all worth considering but are not covered here. Also, different business scenarios call for different high-concurrency implementations, yet the overall design ideas and the reusable solutions are largely similar.

High-concurrency design must also adhere to the three principles of architecture design: simplicity, suitability, and evolution. "Premature optimization is the root of all evil": never detach from the actual business situation, and do not over-design; the suitable solution is the most perfect one.

I hope this article gives you a more comprehensive understanding of high concurrency. If you have experience and insights worth sharing, please leave a comment for discussion.

