A node is a running instance of ES, and a cluster consists of one or more nodes that share the same cluster.name. These nodes work together to share data and balance load. As nodes are added to or removed from the cluster, the cluster reorganizes itself to distribute data evenly. One node in the cluster is elected as the master node (Master Node), which is responsible for managing cluster-wide changes such as creating or deleting an index (Index), or adding or removing nodes. Any node can become the master node. In our example there is only one node, so it takes on the role of master node.

ES distributes data across the cluster through sharding. Think of shards as containers for data. Documents are stored in shards, and shards are distributed among the nodes in the cluster. As the cluster grows or shrinks, ES automatically migrates shards between nodes so that the cluster stays balanced.

A shard can be a primary shard (Primary Shard) or a replica shard (Replica Shard). Each document in an index belongs to exactly one primary shard, so the number of primary shards determines the maximum amount of data your index can store. A replica shard is just a copy of a primary shard. Replicas provide data redundancy, protecting data from loss in the event of hardware failure, and they also serve read requests such as searching and retrieving documents. The number of primary shards is fixed when the index is created, while the number of replica shards can be changed at any time.
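To make concrete why the primary shard count is fixed at index creation, here is a minimal sketch of shard routing. It assumes the routing rule shard = hash(routing) % number_of_primary_shards, where the routing value defaults to the document ID. The function names pick_shard and count_moved are hypothetical, and zlib.crc32 is only a stand-in for the hash ES actually uses.

```python
import zlib


def pick_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a primary shard."""
    # zlib.crc32 is a stand-in hash for illustration, not the hash ES uses.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


def count_moved(doc_ids, old_shards: int, new_shards: int) -> int:
    """Count documents whose shard would change if the shard count changed."""
    return sum(
        1
        for doc_id in doc_ids
        if pick_shard(doc_id, old_shards) != pick_shard(doc_id, new_shards)
    )


if __name__ == "__main__":
    doc_ids = [f"doc-{i}" for i in range(1000)]
    print("doc-42 is stored on shard", pick_shard("doc-42", 5))
    print("documents that would move going from 5 to 6 primary shards:",
          count_moved(doc_ids, 5, 6))
```

Because the shard number depends on the primary shard count, changing that count would route existing documents to different shards. Replica counts do not appear in the formula, which is consistent with replicas being adjustable at any time.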
For the underlying principles, please refer to the chapter "Life Inside a Cluster" in the official documentation.

The following demonstrates horizontal scaling. We add a new virtual machine running an ES instance: the existing ES instance runs on 10.253.1.70, and the new node is 10.253.1.71. The two nodes must be able to communicate with each other. The configuration file is config/elasticsearch.yml. On 10.253.1.70, the relevant configuration is:
cluster.name: elasticsearch_ryan
node.name: "cluster-node-1"
On 10.253.1.71, the relevant configuration is:
cluster.name: elasticsearch_ryan
node.name: "cluster-node-1"
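Before starting the second node, the two configurations can be compared mechanically. The following is a minimal sketch that reads flat `key: value` lines and checks that both files declare the same cluster.name. The helpers read_setting and same_cluster are hypothetical, and the parser deliberately handles only simple one-line settings (no nested YAML, and no `#` inside values).

```python
def read_setting(config_text, key):
    """Return the value of a flat `key: value` line, or None if absent."""
    for line in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        name, value = line.split(":", 1)
        if name.strip() == key:
            return value.strip().strip('"')
    return None


def same_cluster(config_a, config_b):
    """True when both configs declare the same, non-empty cluster.name."""
    name_a = read_setting(config_a, "cluster.name")
    name_b = read_setting(config_b, "cluster.name")
    return name_a is not None and name_a == name_b


if __name__ == "__main__":
    # Sample contents copied from the two configurations shown above.
    node_70 = 'cluster.name: elasticsearch_ryan\nnode.name: "cluster-node-1"\n'
    node_71 = 'cluster.name: elasticsearch_ryan\nnode.name: "cluster-node-1"\n'
    print("same cluster:", same_cluster(node_70, node_71))
```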
In short, the only requirement is that both nodes share the same cluster.name. Start the ES service on 10.253.1.71, and then check the health of the cluster:
curl -XGET "http://10.253.1.70:9200/_cluster/health"
{
  "cluster_name": "elasticsearch_ryan",
  ...
  "number_of_data_nodes": 2,
  "active_primary_shards": 9,
  ...
  "relocating_shards": 0,
  "initializing_shards": 0,
  ...: 0
}
The response shows the status of the cluster. The status values have the following meanings:
green: All primary shards (Primary Shard) and replica shards (Replica Shard) are active.
yellow: All primary shards are active, but not all replica shards are active.
red: Not all primary shards are active.
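The three status rules above can be expressed as a small function. This is only an illustrative sketch of the definitions, not how ES computes health internally; the Shard type and cluster_status function are hypothetical names.

```python
from typing import Iterable, NamedTuple


class Shard(NamedTuple):
    primary: bool  # True for a primary shard, False for a replica
    active: bool   # True once the shard is started and serving requests


def cluster_status(shards: Iterable[Shard]) -> str:
    """Apply the green / yellow / red definitions to a set of shards."""
    shards = list(shards)
    # red: at least one primary shard is not active.
    if any(s.primary and not s.active for s in shards):
        return "red"
    # yellow: all primaries are active, but some replica is not.
    if any(not s.primary and not s.active for s in shards):
        return "yellow"
    # green: every primary and every replica is active.
    return "green"


if __name__ == "__main__":
    one_replica_down = [Shard(True, True), Shard(False, False)]
    print(cluster_status(one_replica_down))
```

A single-node cluster with replicas configured typically reports yellow for this reason: a replica cannot be placed on the same node as its primary, so it stays inactive until a second node joins.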
For managing a distributed ES cluster, we recommend the tool elasticsearch-head, which can be installed as a plugin:
sudo elasticsearch/bin/plugin -install mobz/elasticsearch-head
After installation, open the management interface at http://10.253.1.70:9200/_plugin/head/
There you can see detailed information about the nodes in the distributed cluster, browse index information, and run queries, which is very convenient. The cluster status is also displayed in an intuitive way. You can continue to add data to mongo to test.