
MongoDB replica set question

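Consider this scenario: a MongoDB replica set has suffered failures that leave only two nodes available, and both of them are currently slave (secondary) nodes. The other, failed nodes cannot be restarted, so they cannot rejoin the replica set.
Q: In this situation, is the replica set no longer usable?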

PHPz  2734 days ago  779


  • PHP中文网

    PHP中文网  2017-04-24 09:13:00

    When a majority of the nodes are down, so that the remaining nodes that can reach each other make up no more than half of the set, you can reconfigure the replica set as described in the documentation:
    - http://docs.mongodb.org/manual/tutorial/reconfigure-replica-set-with-unavailable-members/

    The document describes two methods:

    1. Force a reconfiguration of the replica set: remove the down nodes from the configuration so that only the nodes still running form the replica set, which lets a new primary node be elected. You can use this method if your MongoDB version is 2.0 or above.
    2. Replace the replica set. You can use this method if your MongoDB version is below 2.0.
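
The forced reconfiguration in method 1 can be sketched as follows. This is only a minimal Python illustration of the configuration change, not the documented procedure itself: `build_forced_config` is a hypothetical helper, the set name and host names are made up, and the dictionary only mimics the shape of what `rs.conf()` returns.

```python
import copy

def build_forced_config(current_cfg, surviving_hosts):
    """Return a new replica set config that keeps only the surviving members.

    Hypothetical helper for illustration; it only edits a config dictionary
    and does not talk to a MongoDB server.
    """
    new_cfg = copy.deepcopy(current_cfg)
    # Keep the surviving members with their original _id values.
    new_cfg["members"] = [m for m in new_cfg["members"] if m["host"] in surviving_hosts]
    if not new_cfg["members"]:
        raise ValueError("no surviving member found in the current config")
    # A changed configuration carries a higher version number.
    new_cfg["version"] = current_cfg["version"] + 1
    return new_cfg

# Assumed example: a five-member set in which only two members are still running.
current = {
    "_id": "rs0",
    "version": 7,
    "members": [{"_id": i, "host": f"db{i}.example.net:27017"} for i in range(5)],
}
survivors = {"db1.example.net:27017", "db3.example.net:27017"}

forced = build_forced_config(current, survivors)
print([m["_id"] for m in forced["members"]], forced["version"])  # [1, 3] 8
```

In the mongo shell, the equivalent step would be to edit the object returned by `rs.conf()` in the same way and apply it on one of the surviving members with `rs.reconfig(cfg, {force: true})`.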

  • 迷茫

    迷茫  2017-04-24 09:13:00

    This depends on the total number of nodes in your replica set. When the nodes that can reach each other number more than half of the total, a new primary node can be elected and the replica set works normally. When they number half of the total or fewer, all nodes become secondary nodes; the replica set is then read-only and all write operations fail.
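
As a quick check of the rule above, here is a minimal Python sketch. `can_elect_primary` is a hypothetical helper that assumes every node is a voting member; it applies the scenario from the question (two reachable nodes) to sets of different sizes.

```python
def can_elect_primary(total_voting_members, reachable_members):
    """A primary can be elected only with a strict majority of voting members."""
    majority = total_voting_members // 2 + 1
    return reachable_members >= majority

# Two surviving nodes, as in the question, for different original set sizes.
for total in (3, 4, 5):
    print(total, can_elect_primary(total, 2))
# 3 True
# 4 False
# 5 False
```

So with two survivors the set stays writable only if it originally had three members; with four or five members it becomes read-only.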

    The reason is that a MongoDB replica set must never have more than one primary node. If a primary could still be elected when only half of the nodes or fewer can reach each other, separate groups of nodes could each elect their own primary, and the data in the replica set would become inconsistent. To prevent this, all nodes become secondary instead. Once the failed nodes are restored, the replica set runs normally again.
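
The argument can be checked by brute force. The sketch below is a simplified model, not MongoDB's election protocol: it splits `total` nodes into two groups that cannot see each other and counts how many groups reach a given threshold. The "half is enough" rule is hypothetical and included only for contrast.

```python
def sides_that_could_elect(total, group_a_size, threshold):
    """Count how many sides of a two-way network partition reach the threshold."""
    group_b_size = total - group_a_size
    return sum(1 for size in (group_a_size, group_b_size) if size >= threshold)

def max_primaries(total, threshold):
    """Worst case over every possible two-way split of `total` nodes."""
    return max(sides_that_could_elect(total, a, threshold) for a in range(total + 1))

for total in range(2, 8):
    strict = max_primaries(total, total // 2 + 1)     # more than half required
    lenient = max_primaries(total, (total + 1) // 2)  # hypothetical: half is enough
    print(total, strict, lenient)
```

With the strict-majority threshold the worst case is always one primary. With the lenient threshold, an even-sized set split down the middle would allow two.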

    If there are not enough healthy nodes left to elect a primary, there are a few ways to resolve it.

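    1) If the other nodes cannot start because their data is corrupted and, as you describe, two nodes are still alive, you can stop one of the live nodes, copy its data files from disk to a failed node, and then start the nodes again.

    2) If the other nodes cannot start for reasons that cannot be recovered from, you can remove the replSet option and start a surviving node as a standalone server. To get back to replica set mode, you could try deleting the data files of the local database on one healthy node, restarting it with replSet, initializing it, and then adding new nodes with rs.add(). I have not run into this situation in production, so treat this as something to try if there is really no other option.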
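
The rebuild in option 2 comes down to two configuration documents: a fresh single-member configuration, then the same configuration with a member appended. A minimal Python sketch, assuming a set named `rs0` and made-up host names; `initial_config` and `with_added_member` are hypothetical helpers that only build dictionaries and do not contact a server.

```python
def initial_config(set_name, host):
    """Single-member config, the kind of document passed when initiating a set."""
    return {"_id": set_name, "version": 1, "members": [{"_id": 0, "host": host}]}

def with_added_member(cfg, host):
    """Return a copy of cfg with one more member and a bumped version."""
    next_id = max(m["_id"] for m in cfg["members"]) + 1
    return {
        "_id": cfg["_id"],
        "version": cfg["version"] + 1,
        "members": [dict(m) for m in cfg["members"]] + [{"_id": next_id, "host": host}],
    }

cfg = initial_config("rs0", "db1.example.net:27017")
cfg = with_added_member(cfg, "db5.example.net:27017")
print(cfg["version"], [m["host"] for m in cfg["members"]])
# 2 ['db1.example.net:27017', 'db5.example.net:27017']
```

In the mongo shell these two steps correspond to `rs.initiate(cfg)` on the restarted node, followed by `rs.add("host:port")` for each new node.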
    
