
MySQL large website technical architecture: core case analysis

WBOY
2023-05-27 14:31:50

7. Adapting on demand: extensible architecture of the website

Extensibility: the ability to continuously extend or improve a system's functionality with minimal impact on the existing system. It is the open-closed principle applied at the level of system architecture design: the architecture anticipates future functional expansion, so that new functions can be added without modifying the structure and code of the existing system.

Scalability: the system's ability to increase (or decrease) its computing and processing capacity by increasing (or decreasing) the scale of its own resources.

A. Build an extensible website architecture

1. The greatest value of a software architect lies not in how many advanced technologies he has mastered, but in the ability to divide a large system into N low-coupling sub-modules. These sub-modules include horizontal business modules and vertical basic technology modules.

2. The core idea is modularization. On this basis, the coupling between modules is reduced and the reusability of modules is improved.

B. Use distributed message queues to reduce system coupling

1. Event-driven architecture

  • Event-driven architecture (EDA): modules cooperate by passing event messages to one another, which keeps them loosely coupled. A distributed message queue is the most common implementation.

  • The message queue works on the publish-subscribe model: the message sender publishes a message, and one or more message receivers subscribe to it.
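The publish-subscribe model described above can be illustrated with a minimal in-process sketch. This is only a toy stand-in for a real message queue server: the class name `ToyMessageQueue`, its methods, and the topic name used in the usage note are all hypothetical and do not come from any particular product.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, List, Tuple

Handler = Callable[[str], None]


class ToyMessageQueue:
    """Hypothetical in-process stand-in for a message queue server."""

    def __init__(self) -> None:
        # topic -> list of subscriber callbacks
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        # queued (topic, message) pairs, oldest first
        self._pending: Deque[Tuple[str, str]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # The publisher returns as soon as the message is queued;
        # it never calls the subscribers directly.
        self._pending.append((topic, message))

    def dispatch(self) -> int:
        """Deliver queued messages in FIFO order; return the delivery count."""
        delivered = 0
        while self._pending:
            topic, message = self._pending.popleft()
            for handler in self._subscribers.get(topic, []):
                handler(message)
                delivered += 1
        return delivered
```

For instance, two modules could each call `subscribe("order.created", ...)` with their own handler; the publishing module only knows the topic name, not who consumes the message, which is where the loose coupling comes from.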

2. Distributed message queue

  • A queue is a first-in-first-out structure. Applications use the distributed message queue through a remote access interface to put and take messages, which achieves distributed asynchronous calls.

  • The message producer application pushes a message to the message queue server through the remote access interface. The message queue server writes the message to a local in-memory queue and immediately returns a success response to the message producer. The message queue server then looks up the message consumer applications that subscribe to the message in the subscription list, and sends the queued messages to them through the remote communication interface in first-in-first-out (FIFO) order.

  • Distributed message queues can be very complex, for example supporting ESB (Enterprise Service Bus) and SOA (Service-Oriented Architecture), or very simple, using nothing more than MySQL records: the message producer writes messages into the database as records, and the message consumer queries the database sorted by the write timestamp, which yields a de facto distributed message queue.
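The "database records as a queue" idea from the last bullet can be sketched roughly as follows. To stay self-contained, the sketch uses Python's built-in `sqlite3` module as a stand-in for MySQL; the table name `message_queue`, its columns, and the function names are assumptions made for illustration only.

```python
import sqlite3


def create_queue(conn: sqlite3.Connection) -> None:
    # Hypothetical table layout: one row per message.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " body TEXT NOT NULL,"
        " created_at REAL NOT NULL,"
        " consumed INTEGER NOT NULL DEFAULT 0)"
    )


def produce(conn: sqlite3.Connection, body: str, created_at: float) -> None:
    # The producer simply writes the message as a data record.
    conn.execute(
        "INSERT INTO message_queue (body, created_at) VALUES (?, ?)",
        (body, created_at),
    )
    conn.commit()


def consume(conn: sqlite3.Connection):
    # The consumer takes the oldest unconsumed record (FIFO by timestamp,
    # with the id as a tie-breaker) and marks it as consumed.
    row = conn.execute(
        "SELECT id, body FROM message_queue"
        " WHERE consumed = 0 ORDER BY created_at, id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE message_queue SET consumed = 1 WHERE id = ?", (row[0],))
    conn.commit()
    return row[1]
```

This sketch assumes a single consumer. With several concurrent consumers on a real MySQL server, the select-then-update step would need row locking (for example `SELECT ... FOR UPDATE` inside a transaction) so that two consumers do not take the same message.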

C. Use distributed services to create a reusable business platform

1. Distributed services decouple the system through interfaces: different subsystems call each other's services through a shared interface description.

2. Problems with a giant monolithic system: difficult compilation and deployment; difficult code branch management; exhaustion of database connections; difficulty adding new services.

3. Solution

  • Vertical split: split a large application into multiple small applications

  • Horizontal split: split out the reused business and deploy it independently as distributed services. New businesses only need to call these distributed services and do not need to depend on specific module code

4. Web Service and enterprise-level distributed services

Disadvantages: bloated registration and discovery mechanism; inefficient XML serialization means; relatively high overhead HTTP remote communication; complex deployment and maintenance means;

5. Requirements and characteristics of distributed services for large websites

Load balancing, failover, efficient remote communication, integration of heterogeneous systems, minimal intrusion into applications, version management, real-time monitoring

6. Distributed service framework design: Thrift, Dubbo

D. Extensible data structure

Design the data structure using the ColumnFamily (column family) model used in NoSQL databases.

E. Use an open platform to build a website ecosystem

1. The open platform is the interface between the website and the outside world: externally it faces a large number of third-party developers, and internally it faces the many business services within the website.

2. Architecture: API interface, protocol conversion, security, auditing, routing, process

8. Impregnable: Security architecture of the website

A. Website application attack and defense

1. XSS attack

  • An XSS attack tampers with a web page by injecting malicious HTML scripts, so that the user's browser performs malicious operations when the user views the page.

  • One type is the reflected attack: the attacker induces the user to click a link that embeds a malicious script.

  • The other type is the persistent XSS attack: the attacker submits a request containing a malicious script, which is saved in the database of the attacked website. When users browse the page, the malicious script is included in the normal page. It is common in web applications such as forums and blogs.

  • Preventive measures: sanitize input by filtering or escaping dangerous characters, and set the HttpOnly attribute on cookies so that page JS cannot access them
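The two preventive measures above might look like this in a minimal sketch using only the Python standard library. The function names `render_comment` and `session_cookie_header` and the cookie name `session_id` are hypothetical; a real application would normally rely on its web framework's templating and cookie APIs instead.

```python
import html
from http.cookies import SimpleCookie


def render_comment(user_text: str) -> str:
    # Escape <, >, & and quotes so user input is displayed as text
    # instead of being interpreted as HTML or script.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


def session_cookie_header(session_id: str) -> str:
    # Build a Set-Cookie value with the HttpOnly flag, which asks the
    # browser not to expose this cookie to page JavaScript.
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()
```

Escaping at output time targets injected scripts in user-supplied content, while HttpOnly limits the damage if a script does get through, since the session cookie cannot be read from JS.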

2. Injection attack

  • It is divided into SQL injection and OS injection

  • Ways an attacker learns the database structure for SQL injection: open source software, error echo, blind injection

  • SQL injection prevention: sanitize input; bind parameters using precompiled statements
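Parameter binding can be illustrated with a short sketch. It uses Python's built-in `sqlite3` module as a stand-in for a MySQL driver; the `users` table and both function names are assumptions for the example. The unsafe variant is included only to show what binding prevents.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str):
    # DO NOT USE: user input is spliced into the SQL text, so input such as
    # "' OR '1'='1" changes the meaning of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, name: str):
    # The '?' placeholder keeps the value separate from the SQL text,
    # so the input is always treated as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the bound version, a malicious string is simply compared against the `name` column and matches nothing, whereas the spliced version returns every row.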


3. CSRF attack

  • CSRF (Cross Site Request Forgery): the attacker performs illegal operations as a legitimate user through cross-site requests. The main method is to forge a request carrying the user's identity without the user's knowledge, exploiting browser cookies or the server's session policy to steal the user's identity.

  • Prevention: form token, verification code (CAPTCHA), Referer check (check the request source recorded in the Referer field of the HTTP request header)
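One possible way to implement the form token mentioned above is to derive it from the session with an HMAC, so the server can verify it without storing it. This is a sketch under assumptions: `SECRET_KEY`, `issue_form_token`, and `is_valid_form_token` are hypothetical names, and real frameworks usually provide their own CSRF protection.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = secrets.token_bytes(32)


def issue_form_token(session_id: str) -> str:
    # The token is bound to the session, so a third-party site that cannot
    # read the page cannot produce a valid token for the victim's session.
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_form_token(session_id: str, token: str) -> bool:
    expected = issue_form_token(session_id)
    # Constant-time comparison on bytes avoids leaking information via timing.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

The server embeds the token in a hidden form field when rendering the page and rejects any state-changing request whose token does not validate for the current session.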


4. Other attack vulnerabilities

  • Error codes (error echo), HTML comments, file upload, path traversal

5. Web application firewall: ModSecurity

6. Website security vulnerability scanning

B. Information encryption technology and key security management

1. One-way hashing: MD5, SHA, etc., with a salt added

2. Symmetric encryption: DES algorithm, RC algorithm, etc.; the same key is used for encryption and decryption

3. Asymmetric encryption: RSA algorithm

4. Key security management

  • Place the key and the algorithm on an independent server or a dedicated hardware device, and perform data encryption and decryption through service calls.

  • Alternatively, put the encryption and decryption algorithm in the application system and the key on an independent server. For storage, the key is split into several pieces that are encrypted and stored separately on different storage media, which improves performance while still protecting the key.
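Returning to the salted one-way hash in item 1 of this section, a minimal sketch might look like the following. Note the substitution: instead of a single MD5 or SHA pass as named in the text, it uses PBKDF2 with SHA-256 from Python's `hashlib`, which applies the same salt-plus-hash idea many times over. The function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes = None, iterations: int = 100_000):
    # A fresh random salt per password means identical passwords produce
    # different digests, which defeats precomputed lookup tables.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 100_000) -> bool:
    # Recompute with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Only the salt and the digest are stored; the original password cannot be recovered from them, which is the point of a one-way hash.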

C. Information filtering and anti-spam

1. Text matching: solving the problem of sensitive word filtering

  • For a small amount of content, regular-expression replacement is enough

  • When there are many words and high concurrency, use a Trie tree algorithm (double-array Trie)

  • Construct a Hash table for text matching

  • Sometimes noise reduction is needed first, for example when separator characters are inserted into a word, as in "A_r_a_b"
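The Trie-based matching and the noise reduction step above can be sketched as follows. This is a plain dictionary-based Trie, not the double-array Trie named in the text (which is a more compact encoding of the same structure); the class name, the `strip_noise` helper, and the set of noise characters are assumptions.

```python
_END = ""  # sentinel key marking the end of a word; never equals a single character


class SensitiveWordFilter:
    def __init__(self, words):
        # Build the Trie: each node is a dict from character to child node.
        self._root = {}
        for word in words:
            node = self._root
            for ch in word:
                node = node.setdefault(ch, {})
            node[_END] = True

    def mask(self, text: str, mask_char: str = "*") -> str:
        chars = list(text)
        i = 0
        while i < len(chars):
            node, j, match_end = self._root, i, -1
            # Walk the Trie from position i, remembering the longest match.
            while j < len(chars) and chars[j] in node:
                node = node[chars[j]]
                j += 1
                if _END in node:
                    match_end = j
            if match_end != -1:
                chars[i:match_end] = [mask_char] * (match_end - i)
                i = match_end
            else:
                i += 1
        return "".join(chars)


def strip_noise(text: str, noise: str = "_-*. ") -> str:
    # Simple noise reduction: drop separator characters before matching.
    return "".join(ch for ch in text if ch not in noise)
```

For example, a filter built from `["bad", "badge"]` turns `"a badge, bad"` into `"a *****, ***"`, and `strip_noise("b_a_d")` yields `"bad"` so that the disguised word can be matched as well.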

2. Classification algorithm: Bayesian algorithm, TAN algorithm, ARCS algorithm

3. Blacklist: Hash table, Bloom filter
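A Bloom filter for the blacklist could be sketched like this. It is a minimal illustration, assuming SHA-256 with a seed prefix as the family of hash functions; the class name, the default sizes, and the method names are all hypothetical.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in the set";
        # True means "probably in the set" (false positives are possible).
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

Compared with a Hash table, the filter stores only bits rather than the entries themselves, so it uses far less memory for a large blacklist, at the cost of occasionally flagging an address that was never added.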

D. E-commerce risk control

1. Risks: account risk, buyer risk, seller risk, transaction risk

2. Risk control

  • The machine automatically identifies high-risk transactions and information and sends them to risk control auditors for manual review. New risk types discovered manually are in turn used to keep improving the machine's risk control techniques.

  • Rule engine: when certain indicators of a transaction meet certain conditions, the transaction is considered to have a high risk of fraud.

  • Statistical model: use classification algorithms or more complex machine learning algorithms for intelligent statistics. The classifier is trained on the fraudulent transactions found in historical data; the collected and processed transaction information is then fed into the classifier to obtain a risk score for the transaction.
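The rule engine idea above can be sketched as a list of named predicates over a transaction record. Everything specific here is invented for illustration: the rule names, the field names (`amount`, `account_age_days`, `ship_country`, `card_country`), and the thresholds are assumptions, not rules from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    predicate: Callable[[Dict], bool]


# Hypothetical rules; a real system would load these from configuration
# so they can be changed without redeploying code.
RULES: List[Rule] = [
    Rule("large_amount", lambda t: t["amount"] > 10_000),
    Rule("new_account", lambda t: t["account_age_days"] < 1),
    Rule("address_mismatch", lambda t: t["ship_country"] != t["card_country"]),
]


def triggered_rules(transaction: Dict, rules: List[Rule] = RULES) -> List[str]:
    # Return the names of all rules whose condition the transaction meets.
    return [rule.name for rule in rules if rule.predicate(transaction)]


def needs_manual_review(transaction: Dict, threshold: int = 2) -> bool:
    # Flag the transaction for the risk control auditors when enough rules fire.
    return len(triggered_rules(transaction)) >= threshold
```

Keeping rules as data rather than as scattered `if` statements is what lets newly discovered risk types be added as new rules.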

9. Case analysis of Taobao’s architecture evolution

1. LAMP -> Java/Oracle -> MySQL/NoSQL

2. Business drives the continuous progress of technology

10. Wikipedia’s high-performance architecture design analysis

A. Overall architecture of the Wikipedia website:

LAMP open source products, GeoDNS, LVS, Squid, Lighttpd, PHP, Memcached, Lucene, MySQL

B. Wikipedia performance optimization strategy

1. Front-end performance optimization

  • The core of the front-end architecture is the Squid reverse proxy cluster, load balanced by LVS. Before a request reaches the reverse proxy, the CDN serves it if possible.

  • Wikipedia CDN caching guidelines: content pages do not contain dynamic information; each content page has a unique REST-style URL; cache control information is written in the HTML response header.

2. Server-side performance optimization: use APC, ImageMagick, and TeX, and replace PHP's string search function strtr() with a more optimized algorithm

3. Back-end performance optimization:

Cache

  • Data in particularly concentrated hot spots is cached directly in the local memory of the application server

  • The content of cached data should be in a format that can be directly used by the application server.

  • Use the cache server to store session objects

  • Compared with database connections, Memcached persistent connections are very cheap, so create one whenever it is needed

MySQL

  • Use larger server memory

  • Use a RAID0 disk array to speed up disk access

  • Set database transaction consistency to a lower level

  • If the Master database goes down, immediately switch the application to the Slave database and shut down the write service

11. High-availability architecture design analysis of Doris, a massive distributed storage system

For a data storage system, high availability means highly available services and highly reliable data

A. High-availability architecture of distributed storage system

1. Redundancy: server hot backup, multiple copies of data

2. Overall system division:

  • Application server: The client of the storage system initiates a data operation request to the system

  • Data storage server: The core of the storage system, stores data and responds to data operation requests from the application server

  • Management center server: a small server cluster of two machines in master-master hot backup. It is responsible for cluster management and heartbeat health checks of the data storage cluster, for cluster expansion and fault recovery management, and for providing cluster address configuration to application servers, etc.

B. High availability solutions under different fault conditions

1. Fault classification of distributed storage systems: transient fault, temporary fault, permanent fault

2. Transient fault resolution: retry several times

3. Temporary fault resolution: manual intervention is required; a temporary storage server takes over from the problematic server in the meantime

4. Permanent fault resolution: bring a backup server online to replace the permanently failed server
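The "multiple retries" handling of transient faults in item 2 above can be sketched as a small wrapper. The exception type `TransientStorageError`, the function name, and the default attempt count and delay are assumptions for illustration; they are not part of Doris.

```python
import time


class TransientStorageError(Exception):
    """Hypothetical error raised when a storage request fails momentarily."""


def with_retries(operation, attempts: int = 3, delay_seconds: float = 0.0):
    # Call operation() up to `attempts` times, retrying only on transient errors.
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except TransientStorageError as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    # Still failing after all retries: no longer a transient fault,
    # so surface the error to the caller.
    raise last_error
```

A request that keeps failing after the retries would then be treated as a temporary or permanent fault and handled by the other two strategies.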

12. Online shopping flash sale system architecture design case analysis

A. Technical challenges of flash sale activities: impact on the existing website business, application and database load under high concurrency, sudden increase in network and server bandwidth, direct order placement

B. Response strategy for the flash sale system

  • Independent deployment of flash sale system

  • Static flash sale product page

  • Rent network bandwidth for flash sale activities

  • Dynamically generate a random order page URL

C. Flash sale system architecture design

1. How to control when the purchase button on the flash sale product page lights up: the page references a JS file whose content is changed when the sale starts. This file is requested on every visit and must not be cached by the CDN or elsewhere, so it is referenced with a random version number.

2. How to allow only the first submitted orders to reach the order subsystem: control the entrance to the order page so that only a few users can enter, and send all other users directly to the flash-sale-ended page. For example, with 10 servers each handling 10 requests, a server returns an error for any request beyond its first 10; the admitted requests then check the global cache record. The first one enters the order page, and the others are returned a failure.
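One way to read the two-stage admission in item 2 above is sketched below: each server admits only a small number of requests locally, and the admitted ones then compete for a global record. The classes `GlobalStock` and `ServerGate` are hypothetical; in a real deployment the global record would live in a shared cache rather than in process memory.

```python
import threading


class GlobalStock:
    """Stand-in for the global cache record shared by all servers."""

    def __init__(self, stock: int) -> None:
        self._remaining = stock
        self._lock = threading.Lock()

    def try_claim(self) -> bool:
        # Atomically claim one unit if any is left.
        with self._lock:
            if self._remaining > 0:
                self._remaining -= 1
                return True
            return False


class ServerGate:
    """Per-server entrance: only the first local_limit requests go further."""

    def __init__(self, local_limit: int, global_stock: GlobalStock) -> None:
        self._local_limit = local_limit
        self._seen = 0
        self._lock = threading.Lock()
        self._global = global_stock

    def admit(self) -> bool:
        with self._lock:
            if self._seen >= self._local_limit:
                return False  # rejected locally, never touches the global record
            self._seen += 1
        return self._global.try_claim()
```

The local limit means that, however many requests arrive, at most servers times local_limit of them ever reach the shared record, which keeps the load on the global cache and the order subsystem small.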

13. Analysis of typical failure cases of large websites

A. Writing logs can also cause failures

  • The application's own log output and the log output of third-party components must be configured separately

  • Check the log configuration file; the log output level should be at least Warn

  • Turn off the excessive Error logs that some third-party components may output
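The three logging guidelines above could be applied as follows, shown here with Python's standard `logging` module as an illustration. The logger names `myapp` and `noisy_lib` are placeholders for the application and for a chatty third-party component.

```python
import logging


def configure_logging(app_logger_name: str = "myapp",
                      third_party_names=("noisy_lib",)) -> None:
    # Root configuration: nothing below WARNING is written.
    logging.basicConfig(
        level=logging.WARNING,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    # The application's own logger is configured explicitly and separately.
    logging.getLogger(app_logger_name).setLevel(logging.WARNING)
    # Third-party components that flood the log with errors are raised to
    # CRITICAL so that their ERROR output is suppressed.
    for name in third_party_names:
        logging.getLogger(name).setLevel(logging.CRITICAL)
```

Because each logger has its own level, silencing one third-party component does not hide errors from the application itself.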

B. Faults caused by high-concurrency access to the database

  • The homepage should not access the database

  • The homepage should be static

C. Faults caused by locks under high concurrency conditions

Be careful when using lock operations

D. Faults caused by the cache

The cache server is already an integral part of the website architecture and needs to be managed at the same level as the database

E. Faults caused by out-of-sync application startup

F. Faults caused by large files monopolizing disk reads and writes

Do not share storage for small files and large files

G. Faults caused by misuse of the production environment

Be extra careful when accessing the production environment, and have a dedicated DBA maintain the database

H. Faults caused by non-standard processes

Use the diff command to compare the code before committing, to confirm that nothing is committed that should not be; strengthen code review by having at least one other engineer review the code before it is committed, and share responsibility for failures caused by that code

I. Failures caused by bad programming habits

Pay attention to the handling of empty objects, null values, etc.

14. The architect's art of leadership

A. Focus on people rather than products

1. A group of excellent people doing something they love will definitely achieve success

2. The best software management is to explore the outstanding potential of each member of the project team

3. Find a goal worth striving for together and create a working atmosphere in which everyone can maximize their self-worth

B. Discover the excellence of people

1. It is the work that makes the person, not the person that makes the work

2. Most people, ourselves included, are better than we think. Some excellence needs the right environment to bring it out, such as doing something challenging, working with better people, or having the courage to surpass ourselves

3. Discovering the excellence of people is far more meaningful than discovering outstanding people

C. Sharing a beautiful blueprint

1. The blueprint should be clear: what the product should do, what it should not do, and what business goals it should achieve

2. The blueprint should be visual: what value can the product create for users, what market goals can it achieve, and what will the product look like in the end?

3. The blueprint should be simple: it should convey in one sentence what we are doing

4. Architects should stay focused on the target blueprint and be alert to any designs and decisions that deviate from it. Wrong deviations must be corrected promptly; necessary changes must be discussed by everyone and must win everyone's approval again.

D. Jointly participate in the architecture

1. Don’t be the only architect who owns the architecture

2. Let others maintain the framework and architecture documentation

E. Learn to compromise

1. Objections to architectural and technical solutions are, in essence, a way of paying attention to these solutions and trying to understand and accept them. Architects should not be oversensitive; they should share their views frankly and seek common ground while reserving differences

2. Arguments about technical details should be settled by verifying immediately rather than by further discussion

3. When nobody discusses the architecture any more, it shows that the architecture has been integrated into the projects, the systems and the developers. The sooner the architect is forgotten, the more successful the architecture is.

F. Achieve others

1. Our work is not only to produce products, but also to help people succeed, and ultimately ourselves

2. To complete a project, we must not only create value for customers and make profits for the company, but also allow project members to grow

3. As the technical leader of the team, the architect should not try to control anything during the project. Move forward with a flexible plan and blueprint, and the team will manage itself

15. Website Architect Career Guide

  • The purpose of developing software is to solve real-world problems, but people often do not know what the real problems are.

  • Many problems are also encountered during software development: the interests of all parties must be coordinated to win the greatest possible support, and customer needs, software output and development resources must be balanced. A great deal has to be done to realize the original blueprint of the software design.

A. Discover problems and find breakthroughs

1. When expectations are not met, people feel that something is wrong, because a problem is the gap between experience and expectation. There are two ways to eliminate a problem: improve the experience or lower expectations. Merely lowering expectations does not solve the problem; facing the gap between expectation and actual experience is what identifies the problem and reveals a breakthrough.

2. The first thing new employees need to do is to integrate into the team

3. The last thing new employees need to do is to prove their abilities.

B. Ask questions and seek support

1. Once a problem has been found, it is still only the finder's problem, not yet the owner's. To solve it, you must raise the issue and let the owner of the problem know that it exists.

2. Tips for raising questions:

  • Represent "my problem" as "our problem"

  • Ask closed-ended questions to your boss (give AB options for the boss to choose which one is better), and ask open-ended questions to your subordinates

  • Point out the problem instead of criticizing the person

  • Ask questions in an agreeable way

3. Being outspoken means stating your intention directly and clearly without talking in circles, while choosing the manner of expression carefully and taking the feelings of those involved into account

C. Solve the problem and achieve performance

1. Before solving my problem, solve your problem first

  • If you help others solve their problems, others will also help you solve yours

  • In the process of helping others solve problems, you become familiar with the situation

  • You solve other people's problems with your own solution, and that solution is under your control

2. Sidestep problems where appropriate

16. A casual look at website architects

A. Architects divided by function

Design architect, fire-fighting architect, evangelist architect, Geek architect

B. Architects divided by effect

Sherpa architect (usually develops the most technically difficult and challenging modules in the project), Spartan architect, dignitary architect

C. Architects divided by responsibility and role

Product architect (participates in the entire life cycle of the product), basic service architect (platform architect), infrastructure architect

D. Architects divided by focus

Architects who only focus on functions, architects who focus on non-functions, architects who focus on team organization and management, architects who focus on product operations, and architects who focus on the future of the product

E. Architects divided by reputation

The best architects, good architects, average architects, poor architects, and the worst architects

F. Non-mainstream ways to divide architects

Ordinary architects, literary architects, 1 1 architects

Appendix A: Overview of large-scale website technologies

A. Front-end architecture

Browser optimization technology, CDN, static and dynamic separation, independent deployment of static resources, image service, reverse proxy, DNS

B. Application layer architecture

Development framework, page rendering, load balancing, Session management, dynamic page staticization, business splitting, virtualized server

C. Service layer architecture

Distributed messaging, distributed services, distributed cache, distributed configuration

D. Storage layer architecture

Distributed files, relational databases, NoSQL databases, data synchronization

E. Back-end architecture

Search engine, data warehouse, recommendation system

F. Data collection (log) and monitoring

Browser data collection, server business collection, server performance data collection, system monitoring, system alarm

G. Security architecture

Web attacks, data protection

H. Data center computer room architecture

Computer room, cabinet, server


Statement:
This article is reproduced from yisu.com.