What is traditional ACID?
A (Atomicity)
C (Consistency)
I (Isolation)
D (Durability)
Relational databases follow the ACID rules. A database transaction is very similar to a transaction in the real world, and it has the following four characteristics:
1. A (Atomicity)
Atomicity is easy to understand: all operations in a transaction either complete or none of them do. A transaction succeeds only if every operation in it succeeds; if any one operation fails, the entire transaction fails and must be rolled back. For example, a bank transfer of 100 yuan from account A to account B takes two steps: 1) withdraw 100 yuan from account A; 2) deposit 100 yuan into account B. These two steps must either both complete or both not complete. If only the first step completes and the second fails, 100 yuan disappears for no reason.
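The bank transfer can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the accounts table, the starting balances (500 and 200) and the fail_midway flag are assumptions made for the example, not part of the original article.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as a single transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated failure between step 1 and step 2.
                raise RuntimeError("crash after the withdrawal")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()

failed = transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the withdrawal was rolled back
succeeded = transfer(conn, "A", "B", 100)
after_success = balances(conn)   # both steps were applied together
```

After the simulated failure, account A still holds its original balance, because the withdrawal was rolled back along with the rest of the transaction.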
2. C (Consistency)
Consistency is also easy to understand: the database must always be in a consistent state, and running a transaction must not violate the database's existing consistency constraints.
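One way to picture a consistency constraint is a rule that balances may never go negative. The sketch below uses sqlite3 with a CHECK constraint; the table layout, the constraint itself and the withdraw helper are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause is the consistency constraint: no negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('A', 50)")
conn.commit()


def withdraw(conn, name, amount):
    """Return True if the withdrawal kept the database consistent."""
    try:
        with conn:  # rolls back automatically when the constraint is violated
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


rejected = withdraw(conn, "A", 100)  # would leave the balance at -50
balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'A'"
).fetchone()[0]
```

The transaction that would break the constraint is rejected, so the database moves only from one consistent state to another.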
3. I (Isolation)
Isolation means that concurrent transactions do not affect each other. If the data a transaction accesses is being modified by another transaction, then as long as that other transaction has not committed, the data it accesses is unaffected by the uncommitted changes. For example, suppose a transaction is transferring 100 yuan from account A to account B. If B checks his account before the transaction completes, he will not see the newly added 100 yuan.
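The "B checks his account" scenario can be reproduced with two sqlite3 connections to the same database file, one acting as the transfer and one as the reader. This is a rough sketch assuming SQLite's default rollback-journal mode; the file name, table and balances are made up for the example.

```python
import os
import sqlite3
import tempfile


def balance_of(conn, name):
    # fetchall() exhausts the cursor so the read lock is released promptly.
    rows = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchall()
    return rows[0][0]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bank.db")
    writer = sqlite3.connect(path)   # runs the transfer
    reader = sqlite3.connect(path)   # B checking his account

    writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    writer.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 500), ("B", 200)])
    writer.commit()

    # The transfer is in progress but not yet committed.
    writer.execute("UPDATE accounts SET balance = balance - 100 WHERE name = 'A'")
    writer.execute("UPDATE accounts SET balance = balance + 100 WHERE name = 'B'")
    seen_before_commit = balance_of(reader, "B")

    writer.commit()
    seen_after_commit = balance_of(reader, "B")

    writer.close()
    reader.close()
```

The reader sees B's old balance until the writer commits, and the new balance only afterwards.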
4. D (Durability)
Durability means that once a transaction is committed, its modifications are permanently saved in the database and will not be lost even if the system goes down.
CAP
C: Consistency
A: Availability
P: Partition tolerance
The CAP theorem states that a distributed storage system can achieve at most two of the three properties above.
Strong consistency: for example, the data reads back exactly as it was written.
All data replicas in the distributed system have the same value at the same time. (Equivalent to all nodes accessing the same latest copy of the data.)
Availability: for example, Taobao cannot be unusable on Double Eleven.
After some nodes in the cluster fail, the cluster as a whole can still respond to clients' read and write requests. (High availability for data updates.)
Partition tolerance: in practical terms, a partition amounts to a time limit on communication.
If the system cannot reach data consistency within the time limit, a partition has occurred, and a choice must be made between C and A for the current operation.
For example: a bag on Taobao
Strong consistency would require that the number of likes shown for this bag is exactly 141 and never wrong. Precise figures are required, but it is hard to keep the data uniform under high concurrency.
High availability allows weak consistency, such as tolerating errors in the like and view counts, but the website must never go down.
So most website architectures choose AP: weak consistency and high availability.
Partition tolerance is mandatory. The nodes of a distributed system may not be in the same city. Taobao, for example, serves content from the location closest to you and may have servers in Hangzhou, Shanghai and Suzhou. Since current network hardware will inevitably suffer from problems such as latency and packet loss, partition tolerance is something we must achieve, and we can only make a trade-off between consistency and availability. No NoSQL system can guarantee all three at the same time.
CA: traditional Oracle database
CP: Redis, MongoDB
Note: trade-offs must be made in a distributed architecture.
Strike a balance between consistency and availability. Most web applications do not actually require strong consistency, so giving up strong consistency (C) to stay available while tolerating partitions (P) is the current direction of distributed database products.
The choice between consistency and availability
For Web 2.0 websites, many of the main features of relational databases often go unused:
Database transaction consistency requirements
Many real-time web systems do not require strict database transactions and have very low requirements for read consistency. In some cases the requirements for write consistency are not high either, and eventual consistency is acceptable.
Requirements for real-time writing and reading of databases
With a relational database, if you insert a row and query it immediately, you are guaranteed to read that row. Many web applications, however, do not need such strict real-time behavior. For example, after a message is posted on Weibo, it is perfectly acceptable for subscribers to see it a few seconds or even more than ten seconds later.
Requirements for complex SQL queries, especially multi-table join queries
Any web system with a large amount of data strongly avoids join queries across multiple large tables, as well as complex report queries for data analysis. SNS-type websites in particular rule these out at the requirements and product design stage. Often only primary-key lookups on a single table and simple conditional paging queries on a single table remain, so much of the power of SQL goes unused.
Classic CAP diagram
The core of the CAP theorem is that a distributed system cannot satisfy consistency, availability and partition tolerance all at the same time. Of the three, at most two can be satisfied well at once.
Therefore, according to the CAP principle, NoSQL databases are divided into three categories: satisfying the CA principle, satisfying the CP principle and satisfying the AP principle:
CA - Single-site clusters: systems that satisfy consistency and availability, and are generally less scalable.
CP - Systems that satisfy consistency and partition tolerance. Performance is usually not particularly high.
AP - Systems that satisfy availability and partition tolerance, and generally have lower consistency requirements.
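The difference between the CP and AP categories can be sketched with a toy two-replica store, reusing the 141 likes from the Taobao example. The TwoReplicaStore class and its behavior are purely hypothetical and deliberately simplified; real systems are far more involved.

```python
class TwoReplicaStore:
    """Toy store with two replicas; mode is "CP" or "AP"."""

    def __init__(self, mode, initial):
        self.mode = mode
        self.values = [initial, initial]  # one value per replica
        self.partitioned = False          # True when the replicas cannot talk

    def write(self, replica, value):
        if not self.partitioned:
            self.values = [value, value]  # normal case: update both replicas
            return True
        if self.mode == "CP":
            return False                  # refuse the write: consistent but unavailable
        self.values[replica] = value      # AP: accept locally, replicas diverge
        return True

    def read(self, replica):
        return self.values[replica]


cp = TwoReplicaStore("CP", initial=141)
ap = TwoReplicaStore("AP", initial=141)
cp.partitioned = ap.partitioned = True    # simulate a network partition

cp_accepted = cp.write(0, 142)            # a new like arrives at replica 0
ap_accepted = ap.write(0, 142)
cp_views = (cp.read(0), cp.read(1))
ap_views = (ap.read(0), ap.read(1))
```

During the partition the CP store rejects the new like and both replicas keep showing 141, while the AP store accepts it and the two replicas temporarily disagree (142 and 141).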
BASE
BASE is an approach proposed to address the reduced availability caused by the strong consistency of relational databases.
BASE is actually the abbreviation of the following three terms:
Basically Available
Soft state
Eventually consistent
The idea is to improve the overall scalability and performance of the system by letting it relax its data consistency requirements at any given moment. Why? Because large systems, being geographically distributed and subject to extremely high performance requirements, often cannot use distributed transactions to meet these goals. They have to be reached some other way, and BASE is the solution to this problem.
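A very small sketch of "soft state" and "eventually consistent": writes land on a primary copy at once and reach the follower copies only when replication runs, so a follower can briefly return stale data. The class name, the pending queue and the explicit replicate() call are assumptions for illustration and do not describe how any particular product works.

```python
from collections import deque


class EventuallyConsistentStore:
    def __init__(self, follower_count=2):
        self.primary = {}
        self.followers = [{} for _ in range(follower_count)]
        self.pending = deque()  # writes not yet copied to the followers

    def write(self, key, value):
        # Basically available: the write is accepted immediately.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_follower(self, index, key):
        # Soft state: a follower may lag behind the primary.
        return self.followers[index].get(key)

    def replicate(self):
        # Eventually consistent: apply the queued writes, in order.
        while self.pending:
            key, value = self.pending.popleft()
            for follower in self.followers:
                follower[key] = value


store = EventuallyConsistentStore()
store.write("post:1", "hello")
stale_read = store.read_from_follower(0, "post:1")  # not replicated yet
store.replicate()
fresh_read = store.read_from_follower(0, "post:1")
```

Until replicate() runs, the follower does not know about the new post; afterwards every copy agrees.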
Introduction to distributed cluster
Distributed system
A distributed system consists of multiple computers and communication software components connected through a computer network (a local area network or a wide area network). Distributed systems are software systems built on top of the network. It is precisely because of the characteristics of this software that distributed systems have a high degree of cohesion and transparency. Therefore, the difference between a network and a distributed system lies more in the high-level software (especially the operating system) than in the hardware. Distributed systems can run on different platforms such as PCs, workstations, LANs and WANs.
To put it simply:
Distributed: different service modules (projects) are deployed on multiple servers. They communicate and call each other through RPC/RMI, providing services externally and cooperating within the group.
Cluster: the same service module is deployed on multiple different servers, and distributed scheduling software schedules them in a unified way to provide services and access externally.
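The "unified scheduling" of a cluster can be pictured as a dispatcher that hands each request to the next copy of the same service. Round-robin is just one possible policy, chosen here for simplicity; the class and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinScheduler:
    """Spread requests evenly over identical copies of one service."""

    def __init__(self, servers):
        self._servers = cycle(servers)  # endless rotation over the servers

    def dispatch(self, request):
        server = next(self._servers)
        return f"{server} handled {request}"


scheduler = RoundRobinScheduler(["server-1", "server-2", "server-3"])
results = [scheduler.dispatch(f"request-{i}") for i in range(4)]
```

The fourth request wraps around to server-1 again, since every server in the cluster runs the same module and can handle any request.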