In addition to reading this guide, we recommend you run the Elasticsearch Health Check-Up. It will detect issues and improve your Elasticsearch performance by analyzing your shard sizes, threadpools, memory, snapshots, disk watermarks and more. The Elasticsearch Check-Up is free and requires no installation.
Elasticsearch is a distributed system, and each cluster may contain one or more nodes. For a cluster to become operational, Elasticsearch needs a quorum, that is, a minimum number of master-eligible nodes that can see each other. By default, every node in Elasticsearch is master-eligible. The master is responsible for all the cluster coordination tasks needed to manage the cluster state.
When you create a cluster, no matter how many nodes you are configuring, the quorum is set to one by default. That means the cluster can work as long as one master node is operational. But if you are running a production cluster with more than two nodes, you should configure an odd number of dedicated master nodes. Most clusters are configured with at least three dedicated master nodes, with the quorum (the minimum number of master nodes) set to two.
Read below to see why it is recommended to configure an odd number of master nodes and why it’s important to set the quorum of minimum master nodes. Setting the quorum of the minimum master nodes is controlled by the following parameters in the elasticsearch.yml file on every node:
discovery.zen.ping.unicast.hosts: ["host1:tcp_port", "host2:tcp_port", "host3:tcp_port"]
discovery.zen.minimum_master_nodes: 2
Note: List only the host addresses and ports of the master-eligible nodes under the discovery.zen.ping.unicast.hosts setting, and do so on every node, including the master nodes themselves. A common mistake is to list the host information of every node in the cluster under this setting.
The split-brain problem
At any given time, only one of the master-eligible nodes acts as the master of the cluster. Split-brain is the situation in which the cluster ends up with more than one master.
Take, for example, a cluster that has two master-eligible nodes, M1 and M2, with minimum_master_nodes set to one. Split-brain can occur if both M1 and M2 are alive but the network connection between them is interrupted. When that happens, M1 and M2 each conclude that they are alone in the cluster, and each elects itself as the master. At this point, your cluster has two master nodes and you have a split-brain situation.
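The M1/M2 scenario can be sketched as a toy model. This is not how Elasticsearch actually runs elections; the function name elected_masters and the "lowest name wins" rule are assumptions made purely for illustration. The only thing modeled is the quorum check: a group of nodes that can still reach each other elects a master only if it contains at least minimum_master_nodes master-eligible nodes.

```python
def elected_masters(partitions, minimum_master_nodes):
    """Return the master elected by each network partition, if any.

    partitions: list of sets of master-eligible node names that can
    still reach each other. A partition elects a master only when it
    holds at least `minimum_master_nodes` nodes.
    """
    masters = []
    for group in partitions:
        if len(group) >= minimum_master_nodes:
            # Hypothetical tie-break: the lowest node name wins.
            masters.append(min(group))
    return masters


# The network link between M1 and M2 is cut.
split = [{"M1"}, {"M2"}]

print(elected_masters(split, 1))  # quorum of one: both sides elect a master
print(elected_masters(split, 2))  # quorum of two: neither side can elect

# Three master-eligible nodes, quorum of two, M3 isolated.
print(elected_masters([{"M1", "M2"}, {"M3"}], 2))
```

With a quorum of one, both halves of the partition produce a master, which is exactly the split-brain case. With a quorum of two, a two-node cluster avoids split-brain but cannot elect anyone during the partition, while a three-node cluster keeps working on the majority side.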
Best practices to avoid the split-brain problem
The split-brain problem can be avoided by setting the minimum number of master nodes using the following formula:
minimum_master_nodes = (N/2)+1
Where N is the total number of master-eligible nodes in the cluster. The division is an integer division: N/2 is rounded down to the nearest whole number before 1 is added. For example, if the total number of master-eligible nodes is 3, then 3/2 rounds down to 1 and minimum_master_nodes is set to 2.
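As a quick check of the formula, here is a minimal sketch that reads N/2 as integer division (rounded down) and also prints how many master-eligible nodes can be lost before the quorum is gone. The function name minimum_master_nodes is just a convenient label here, not an Elasticsearch API.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum for pre-7.0 clusters: (N / 2) + 1 with integer division."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


print("N  quorum  tolerated failures")
for n in range(1, 8):
    quorum = minimum_master_nodes(n)
    # Nodes that may fail while a quorum is still reachable.
    print(f"{n}  {quorum}       {n - quorum}")
```

The table also shows why an odd number of master-eligible nodes is recommended: going from 3 to 4 nodes raises the quorum from 2 to 3 but still tolerates only one failure, so the extra node adds no resilience.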
Considerations for different Elasticsearch versions
The concept described in the previous sections applies to all Elasticsearch versions before 7.0. In version 7.0, the discovery module, which is responsible for all these cluster communication settings, was completely reworked, and you no longer have to set the quorum of master nodes yourself. Elasticsearch now decides by itself which nodes are needed to form a quorum. As a result, neither discovery.zen.ping.unicast.hosts nor discovery.zen.minimum_master_nodes is used in its old form any longer.
discovery.zen.ping.unicast.hosts has been renamed to discovery.seed_hosts, and a new setting, cluster.initial_master_nodes, defines the initial set of master-eligible nodes during the cluster bootstrapping process.
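The rename can be sketched as a small helper that maps pre-7.0 discovery settings to their 7.x names. This is an illustrative sketch only: the function to_7x_settings and the node names master-1 to master-3 are hypothetical, and the helper covers just the settings discussed in this guide. It drops the quorum setting, carries the host list over to discovery.seed_hosts, and adds cluster.initial_master_nodes from a list you supply.

```python
OLD_HOSTS = "discovery.zen.ping.unicast.hosts"
OLD_QUORUM = "discovery.zen.minimum_master_nodes"


def to_7x_settings(old, initial_master_names):
    """Return a copy of `old` using the 7.x discovery setting names."""
    # Keep unrelated settings; drop the two pre-7.0 discovery settings.
    new = {k: v for k, v in old.items() if k not in (OLD_HOSTS, OLD_QUORUM)}
    if OLD_HOSTS in old:
        new["discovery.seed_hosts"] = list(old[OLD_HOSTS])
    # Used only while the cluster is bootstrapped for the first time.
    new["cluster.initial_master_nodes"] = list(initial_master_names)
    return new


old_settings = {
    "cluster.name": "example",  # hypothetical cluster name
    OLD_HOSTS: ["host1:tcp_port", "host2:tcp_port", "host3:tcp_port"],
    OLD_QUORUM: 2,
}

new_settings = to_7x_settings(
    old_settings, ["master-1", "master-2", "master-3"]
)
for key, value in new_settings.items():
    print(f"{key}: {value}")
```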
Always keep up to date with the latest best practices. It’s a good idea to run the Elasticsearch error check-up at least once a month. This will allow you to make sure your cluster is up to date with the latest configuration recommendations.