How to prevent split-brain in an HA cluster
1. Introduction
Split-brain refers to the situation in a high-availability (HA) system where two connected nodes lose contact with each other, and the system that was originally a whole splits into two independent nodes. The two nodes then begin to compete for shared resources, resulting in system chaos and data corruption. For HA of stateless services, split-brain does not matter; but for HA of stateful services (such as MySQL), it must be strictly prevented. (Yet some production systems configure stateful services using an HA setup meant for stateless services, and the results can be imagined...)
2. How to prevent HA cluster split-brain
Generally, two methods are used:
1. Arbitration
When two nodes disagree, a third-party arbiter decides which one to listen to. This arbiter may be a lock service, a shared disk, or something else.
2. Fencing
When the status of a node cannot be determined, kill the other node through fencing to ensure that the shared resources are completely released. The prerequisite is a reliable fence device.
Ideally, neither of the above should be missing.
However, if the nodes do not use shared resources, as in database HA based on master-slave replication, we can safely omit the fence device and keep only arbitration. Besides, in many environments, such as cloud hosts, no fence device is available at all.
So can we omit arbitration and only keep the fence device?
No. When two nodes lose contact with each other, they will fence each other at the same time. If the fencing method is reboot, the two machines will keep restarting. If the fencing method is power off, the outcome may be that both nodes die together, or that one survives. But if the nodes lost contact because one of them has a network card failure, and the survivor happens to be the faulty node, the ending will be tragic.
So a plain two-node cluster cannot prevent split-brain no matter what.
3. Is it safe without a fence device?
Take the data replication of PostgreSQL or MySQL as an example.
In a replication-based scenario, the master and slave nodes do not share resources, so there is no problem if both nodes are alive. The question is whether clients will access the node that is supposed to be dead, which brings us to client routing.
There are several methods of client routing: based on a VIP, based on a Proxy, based on DNS, or simply having the client maintain a list of server addresses and determine the master and slave by itself. Whichever method is used, the routing must be updated when the master and slave switch.
Routing based on DNS is not reliable because DNS may be cached by the client and is difficult to clear.
VIP-based routing carries some uncertainty. If the node that is supposed to be dead does not remove its VIP, it may come back and cause trouble at any time (even if the new owner has updated the ARP cache on all hosts through arping, once the ARP entry on some host expires and that host sends an ARP query, an IP conflict will occur). Therefore, the VIP can be regarded as a special kind of shared resource that must be removed from the faulty node. As for how to remove it, the simplest way is for the faulty node to remove the VIP itself once it discovers it has lost contact, provided it is still alive (if it is dead, there is nothing to remove). What if the process responsible for removing the VIP cannot work? In that case, you can fall back on a soft fence device that is not fully reliable (such as ssh).
Proxy-based routing is more reliable, because the Proxy is the only service entry point. As long as the Proxy is updated in one place, clients will not access the wrong node; however, the high availability of the Proxy itself must also be considered.
As for the method based on a server address list, the client needs to determine the master and slave by querying the backend service (for example, whether the PostgreSQL/MySQL session is in read-only mode). If there are two masters at this point, the client will be confused. To prevent this, the original master node must stop its service by itself after discovering that it has lost contact, just as with the VIP removal above.
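To make the address-list case concrete, here is a minimal sketch of how a client might pick the master. The names `choose_master`, `MultipleMastersError` and the `is_read_only` probe are hypothetical and are not taken from any driver; a real client would implement the probe with a read-only check against each server. Raising an error when two masters are seen is one possible policy, assumed here for illustration.

```python
from typing import Callable, Iterable, Optional


class MultipleMastersError(Exception):
    """Raised when more than one server claims to be writable."""


def choose_master(servers: Iterable[str],
                  is_read_only: Callable[[str], Optional[bool]]) -> Optional[str]:
    """Return the single writable server, or None if no master is reachable.

    is_read_only(addr) is a hypothetical probe: True for a slave,
    False for a master, None if the server cannot be reached.
    """
    masters = [addr for addr in servers if is_read_only(addr) is False]
    if len(masters) > 1:
        # Two masters at once is the split-brain symptom; refuse to guess.
        raise MultipleMastersError(", ".join(masters))
    return masters[0] if masters else None
```

Refusing to choose only protects this one client; the old master still has to stop its service by itself, as described above.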
Therefore, in order to prevent the faulty node from causing trouble, the faulty node should release the resources by itself after losing contact. In order to cope with the failure of the process that releases the resources, a soft fence can be added. Under this premise, it can be considered that it is safe without reliable physical fence equipment.
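The division of labour in this section can be sketched as two small decision functions: one for the node that has lost contact, one for the node taking over. This is only an illustration; the function names and the action strings (`remove_vip`, `stop_service`, `soft_fence_peer`, and so on) are hypothetical placeholders for whatever commands your environment uses.

```python
from typing import List


def isolated_node_actions(has_contact: bool, holds_vip: bool,
                          service_running: bool) -> List[str]:
    """Actions a node takes on itself once it notices it has lost contact."""
    if has_contact:
        return []
    actions = []
    if holds_vip:
        actions.append("remove_vip")      # the VIP is a special shared resource
    if service_running:
        actions.append("stop_service")    # avoid a second writable master
    return actions


def survivor_actions(peer_confirmed_released: bool) -> List[str]:
    """Actions for the node that takes over the service."""
    actions = []
    if not peer_confirmed_released:
        # Best-effort soft fence (for example over ssh); it is not fully
        # reliable and only backs up the peer's own self-release.
        actions.append("soft_fence_peer")
    actions.extend(["acquire_vip", "promote_service"])
    return actions
```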
4. Can data be lost after master-slave switching?
Whether data is lost after a master-slave switch and whether split-brain occurs can be treated as two different issues. Again, take the data replication of PostgreSQL or MySQL as an example.
For PostgreSQL, if synchronous streaming replication is configured, no data will be lost regardless of whether the routing is correct. A client routed to the wrong node (the old master) cannot write any data at all: the old master will wait forever for acknowledgement from the slave, and the slave, which now considers itself the master, will of course ignore it. It is not good if this situation persists, but it gives the cluster monitoring software enough time to correct the routing error.
For MySQL, even if semi-synchronous replication is configured, it may automatically degrade to asynchronous replication after a timeout. To prevent this degradation, you can set an extremely large rpl_semi_sync_master_timeout while keeping rpl_semi_sync_master_wait_no_slave on (the default value). However, if the slave then fails, the master will stall as well. The solution is the same as for PostgreSQL: either configure 1 master and 2 slaves, so that everything is fine as long as the two slaves are not down at the same time, or use external cluster monitoring software to switch dynamically between semi-synchronous and asynchronous replication.
If asynchronous replication was configured from the start, it means you are already prepared to lose data. In that case, losing some data during a master-slave switch is not a big deal, but the number of automatic switches must be controlled. For example, an original master that has been failed over must not be allowed to come back online automatically. Otherwise, if the failover was caused by network jitter, the master and slave will keep switching back and forth, losing data and destroying data consistency.
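One way to control the number of automatic switches is a small guard object in the monitoring logic. The sketch below is an assumption about how this could look, not part of any cluster software: `FailoverGuard`, its parameters, and the idea of a sliding time window are all hypothetical.

```python
import time
from typing import List, Optional, Set


class FailoverGuard:
    """Limits automatic failovers and keeps demoted masters offline
    until an operator clears them."""

    def __init__(self, max_failovers: int, window_seconds: float):
        self.max_failovers = max_failovers
        self.window_seconds = window_seconds
        self._history: List[float] = []   # times of recent failovers
        self._blocked: Set[str] = set()   # old masters awaiting manual review

    def may_fail_over(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget failovers that fall outside the sliding window.
        self._history = [t for t in self._history
                         if now - t < self.window_seconds]
        return len(self._history) < self.max_failovers

    def record_failover(self, old_master: str,
                        now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._history.append(now)
        self._blocked.add(old_master)

    def may_rejoin_automatically(self, node: str) -> bool:
        return node not in self._blocked

    def clear(self, node: str) -> None:
        """Called by an operator after checking the old master's data."""
        self._blocked.discard(node)
```

With `FailoverGuard(1, 60)`, a second failover within 60 seconds of the first is refused, and the old master stays blocked until `clear` is called for it.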
5. How to implement the above strategy
You can implement a script that follows the above logic from scratch. But I prefer to build on mature cluster software, such as Pacemaker + Corosync with appropriate resource agents. I strongly advise against Keepalived. It is not suited to HA of stateful services; even if you add arbitration and fencing to the solution, it always feels awkward.
There are also some points to note when using Pacemaker + Corosync:
1) Understand the functions and principles of Resource Agent
Only by understanding the functions and principles of a Resource Agent can you know which scenarios it applies to. For example, the pgsql resource agent is relatively complete: it supports synchronous and asynchronous streaming replication, can switch between the two automatically, and can ensure that no data is lost under synchronous replication. The current MySQL resource agent, however, is very weak. Without GTID and without log compensation, it is easy to lose data. It is better not to use it and to keep using MHA (but be sure to guard against split-brain when deploying MHA).
2) Ensure quorum
Quorum can be regarded as Pacemaker's own arbitration mechanism. A majority of all nodes in the cluster elects a coordinator, and all instructions in the cluster are issued by this coordinator, which perfectly eliminates the split-brain problem. For this mechanism to work effectively, there must be at least 3 nodes in the cluster, and no-quorum-policy must be set to stop, which is also the default value. (Many tutorials set no-quorum-policy to ignore for convenience of demonstration. Doing this in production without any other arbitration mechanism is very dangerous!)
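The majority rule behind quorum can be illustrated in a few lines. This is a simplified sketch of the arithmetic only, not Pacemaker's or Corosync's actual code; `has_quorum` is a hypothetical helper name.

```python
def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True if this partition holds a strict majority of the cluster.

    reachable_nodes counts the local node as well.
    """
    if total_nodes <= 0 or not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("invalid node counts")
    return reachable_nodes > total_nodes // 2


# In a 3-node cluster, a partition of 2 nodes keeps quorum and 1 node loses it.
# In a 2-node cluster, neither side has a majority once the nodes are separated.
print(has_quorum(3, 2), has_quorum(3, 1), has_quorum(2, 1))
```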
But what if there are only 2 nodes?
The first method is to borrow a machine to make up 3 nodes, and then set location constraints so that resources are never allocated to that node.
The second method is to combine several small clusters that cannot reach quorum on their own into one large cluster, again using location constraints to control where resources are allocated.
But if you have many two-node clusters, cannot find enough spare nodes to make up the numbers, and do not want to merge these two-node clusters into one large cluster (for example, because it would be inconvenient to manage), you can consider the third method.
The third method is to configure a preempted resource, together with colocation constraints between the service and this preempted resource, so that whichever node seizes the preempted resource provides the service. This preempted resource can be a lock service, for example one built on ZooKeeper, or simply one written from scratch, as in the following example.
http://my.oschina.net/hanhanztj/blog/515065
(This example uses short connections over the HTTP protocol. A more careful approach is to use long connections with heartbeat detection, so that the server can promptly detect a disconnection and release the lock.)
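To show the idea of a lock that is released when its holder stops sending heartbeats, here is a minimal in-memory sketch based on a lease with a time-to-live. It is not the implementation from the linked example; `LeaseLock` and its method names are hypothetical, and a real lock service would sit behind a network protocol.

```python
import time
from typing import Optional


class LeaseLock:
    """A lock that expires unless its holder keeps renewing it."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def acquire_or_renew(self, node: str, now: Optional[float] = None) -> bool:
        """Grant the lock if it is free, expired, or already held by node."""
        now = time.monotonic() if now is None else now
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == node:
            self.holder = node
            self.expires_at = now + self.ttl   # each heartbeat extends the lease
            return True
        return False
```

Each successful call acts as a heartbeat. A holder that cannot renew in time should stop providing the service before its lease runs out, in line with the self-release rule from section 3.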
However, you must also ensure the high availability of this preempted resource. You can make the service that provides the preempted resource highly available in its own right, or, more simply, deploy 3 lock services: one on each of the two nodes, and the third on a separate dedicated arbitration node. The lock is considered acquired when at least 2 of the 3 locks are obtained. This arbitration node can provide arbitration for many clusters (this is needed because a machine can only run one Pacemaker instance; otherwise you could do the same thing with an arbitration node running N Pacemaker instances). However, unless you have no other choice, prefer the earlier methods, that is, satisfy Pacemaker's quorum. They are simpler and more reliable.
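The "at least 2 of the 3 locks" rule can be sketched as follows. This is an illustration under assumptions: `acquire_majority` is a hypothetical helper, and each lock service is represented by a plain callable that returns True when it grants the lock to the given node.

```python
from typing import Callable, Sequence


def acquire_majority(node: str,
                     lock_services: Sequence[Callable[[str], bool]]) -> bool:
    """Try every lock service; succeed only with a strict majority of grants."""
    granted = 0
    for try_lock in lock_services:
        try:
            if try_lock(node):
                granted += 1
        except Exception:
            # An unreachable lock service counts as a refusal.
            pass
    return granted > len(lock_services) // 2
```

With three services, the node may provide the service only if at least two of them grant the lock. Releasing locks that were granted during a failed attempt is left out of this sketch.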
6. Reference
http://blog.chinaunix.net/uid-20726500-id-4461367.html
http://my.oschina.net/hanhanztj/blog/515065
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Pacemaker_Explained/index.html
http://clusterlabs.org/wiki/PgSQL_Replicated_Cluster
http://mysqllover.com/?p=799
http://gmt-24.net/archives/1077
