From the ZooKeeper FAQ:
Reliability:
A single ZooKeeper server (standalone) is essentially a coordinator with
no reliability (a single serving node failure brings down the ZK service).
A 3 server ensemble (you need to jump to 3 and not 2 because ZK works
based on simple majority voting) allows for a single server to fail and
the service will still be available.
So if you want reliability go with at least 3. We typically recommend
having 5 servers in "online" production serving environments. This allows
you to take 1 server out of service (say planned maintenance) and still
be able to sustain an unexpected outage of one of the remaining servers
w/o interruption of the service.
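With a 3-server ensemble, if one server is taken out of rotation and another suffers an unexpected outage, there is still one server left, which should keep the service from being interrupted. So why are 5 servers needed? Or is something more than service interruption being considered?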
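Update:
Thanks to @sbridges for pointing out that this is about maintaining a quorum. ZK defines a quorum as a strict majority of N, that is floor(N/2) + 1 (the same as ceil(N/2) when N is odd), where N is the original number of servers in the ensemble, not just the set currently available.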
A Google search for ZK quorum then turned up this passage in the chapter on ZK in the HBase book:
In ZooKeeper, an even number of peers is supported, but it is normally not used because an even sized ensemble requires, proportionally, more peers to form a quorum than an odd sized ensemble requires. For example, an ensemble with 4 peers requires 3 to form a quorum, while an ensemble with 5 also requires 3 to form a quorum. Thus, an ensemble of 5 allows 2 peers to fail and still maintain quorum, and thus is more fault tolerant than the ensemble of 4, which allows only 1 down peer.
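The arithmetic in that passage can be checked with a short sketch. It assumes the default majority quorum (more than half of the configured ensemble); the function names are illustrative and not part of any ZooKeeper API.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of an n-server ensemble (assumed default quorum rule)."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can be down while a quorum is still reachable."""
    return n - quorum_size(n)


# Compare ensemble sizes side by side.
for n in range(1, 8):
    print(f"{n} servers: quorum = {quorum_size(n)}, "
          f"tolerated failures = {tolerated_failures(n)}")
```

Under this rule, going from 3 to 4 servers or from 5 to 6 raises the quorum size without raising the number of failures the ensemble can absorb, which is why odd sizes are the usual choice.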
A paraphrase of Wikipedia from Edward J. Yoon's blog:
Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.
Answer:
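ZooKeeper requires a quorum of servers to be up, where the quorum is a strict majority of the ensemble: floor(N/2) + 1, which equals ceil(N/2) when N is odd. For a 3-server ensemble, this means 2 servers must be up at any time; for a 5-server ensemble, 3 servers must be up at any time. So with 3 servers, one server in maintenance plus one unexpected outage leaves only 1 server, which is below the quorum of 2, and the service becomes unavailable. With 5 servers, the same situation leaves 3 servers up, which is exactly a quorum.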
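The scenario from the FAQ (one server in planned maintenance plus one unexpected outage) can be played through as a minimal sketch. It assumes the default majority quorum counted against the configured ensemble size; `has_quorum` is a hypothetical helper, not a ZooKeeper function.

```python
def has_quorum(ensemble_size: int, servers_down: int) -> bool:
    """True if the servers still up form a strict majority of the configured ensemble."""
    servers_up = ensemble_size - servers_down
    return servers_up >= ensemble_size // 2 + 1


# One server in planned maintenance plus one unexpected outage.
servers_down = 2
for ensemble_size in (3, 5):
    status = "available" if has_quorum(ensemble_size, servers_down) else "unavailable"
    print(f"{ensemble_size}-server ensemble with {servers_down} down: {status}")
```

The majority is measured against the configured ensemble size, not against the servers that happen to be alive, so a lone survivor in a 3-server ensemble cannot serve requests on its own.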