Skip to main content
 首页 » 编程设计

apache-zookeeper之ZooKeeper 可靠性之三个节点与五个节点

2024年08月06日21mfryf

来自ZooKeeper FAQ :

Reliability: 
 
A single ZooKeeper server (standalone) is essentially a coordinator with 
no reliability (a single serving node failure brings down the ZK service). 
 
A 3 server ensemble (you need to jump to 3 and not 2 because ZK works 
based on simple majority voting) allows for a single server to fail and 
the service will still be available. 
 
So if you want reliability go with at least 3. We typically recommend 
having 5 servers in "online" production serving environments. This allows 
you to take 1 server out of service (say planned maintenance) and still 
be able to sustain an unexpected outage of one of the remaining servers 
w/o interruption of the service. 

对于 3 台服务器的整体,如果一台服务器停止轮换并且一台服务器意外中断,则仍然有一台剩余服务器应确保服务不会中断。那为什么需要5台服务器呢?或者所考虑的不仅仅是中断服务?

更新:

感谢@sbridges 指出这与维持法定人数有关。 ZK 定义法定人数的方式是 ceil(N/2),其中 N 是集合中的原始数字(而不仅仅是当前可用集)。

现在,Google 搜索 ZK quorum 在 HBase 书籍 chapter on ZK 中找到了这一点。 :

In ZooKeeper, an even number of peers is supported, but it is normally not used because an even sized ensemble requires, proportionally, more peers to form a quorum than an odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail and still maintain quorum, and thus is more fault tolerant than the ensemble of 4, which allows only 1 down peer.

Edward J. Yoon 的 blog 中对维基百科的释义:

Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.

请您参考如下方法:

Zookeeper 要求您拥有一定数量的服务器,其中法定数量为ceil(N/2)。对于 3 台服务器的整体,这意味着 2 台服务器必须随时启动,对于 5 台服务器的整体,3 台服务器需要随时启动。