Elasticsearch Rebalance

Opster Team

March 2021


Overview

Cluster rebalancing is the process by which an Elasticsearch cluster distributes data across the nodes. Specifically, it refers to the movement of existing data shards to another node to improve the balance across the nodes (as opposed to the allocation of new shards to nodes). Usually, it is a completely automatic process that requires no outside intervention. However, there are a number of parameters Elasticsearch uses to regulate this process.

Examples

The command below applies the cluster settings that enable automatic cluster rebalancing. Running it is not necessary, because the values used are in fact the defaults.

PUT /_cluster/settings?flat_settings=true
{
    "transient" : {
        "cluster.routing.rebalance.enable" : "all",
        "cluster.routing.allocation.allow_rebalance" : "indices_all_active",
        "cluster.routing.allocation.cluster_concurrent_rebalance" : "2"
    }
}
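
Before changing any of these values, it can help to see which ones are actually in force. The following is a minimal Python sketch, assuming you have already fetched and parsed the JSON response of GET /_cluster/settings?include_defaults=true&flat_settings=true with an HTTP client of your choice. The function name and the sample response are illustrative assumptions, not part of Elasticsearch.

```python
# The three rebalance settings discussed in this guide.
REBALANCE_KEYS = (
    "cluster.routing.rebalance.enable",
    "cluster.routing.allocation.allow_rebalance",
    "cluster.routing.allocation.cluster_concurrent_rebalance",
)


def effective_rebalance_settings(response):
    """Return {setting: (value, scope)} for each rebalance setting.

    `response` is the parsed JSON of
    GET /_cluster/settings?include_defaults=true&flat_settings=true
    Transient values override persistent ones, which override defaults.
    """
    effective = {}
    for key in REBALANCE_KEYS:
        for scope in ("transient", "persistent", "defaults"):
            value = response.get(scope, {}).get(key)
            if value is not None:
                effective[key] = (value, scope)
                break
    return effective


# Hypothetical response, trimmed to the relevant keys.
sample = {
    "persistent": {},
    "transient": {"cluster.routing.rebalance.enable": "none"},
    "defaults": {
        "cluster.routing.rebalance.enable": "all",
        "cluster.routing.allocation.allow_rebalance": "indices_all_active",
        "cluster.routing.allocation.cluster_concurrent_rebalance": "2",
    },
}

for name, (value, scope) in effective_rebalance_settings(sample).items():
    print(f"{name} = {value} ({scope})")
```

In the hypothetical sample, a transient override has disabled rebalancing even though the default is "all", which is the kind of leftover setting worth ruling out first.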

Notes and good things to know

In general, the cluster rebalance settings have sensible defaults, and it is not advisable to disable cluster rebalancing. It is usually most sensible to wait until all indices are active before rebalancing, since recovering the indices takes priority over moving them around. Finally, it is recommended to keep the number of concurrent rebalances at 2 (the default), since having a large number of shards moving at the same time can use a lot of resources and cause instability. Increasing this number only makes sense on large clusters.

You can consider the “rebalance” process to be a tendency to spread the total number of shards across all nodes in the cluster, and also to spread the shards of any given index as evenly as possible across the cluster. The rebalance is a “soft” algorithm, and will be overruled by other “hard” factors such as disk-based shard allocation or shard allocation awareness.

If you think your cluster is not rebalancing as it should, first check the “hard” limits imposed by shard allocation awareness or disk-based shard allocation before tweaking the rebalance parameters.
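
To judge whether the cluster really is unbalanced, one option is to count shards per node. The sketch below is a rough illustration in Python, assuming the input is the parsed JSON of GET /_cat/shards?format=json (a list of rows with "index", "shard", "state" and "node" fields). The function names and sample rows are hypothetical.

```python
from collections import Counter


def shards_per_node(cat_shards_rows):
    """Count STARTED shards on each node.

    `cat_shards_rows` is the parsed JSON of GET /_cat/shards?format=json.
    Unassigned shards have no node and are skipped. Nodes holding no
    shards at all do not appear in that output, so they are absent here.
    """
    counts = Counter()
    for row in cat_shards_rows:
        if row.get("state") == "STARTED" and row.get("node"):
            counts[row["node"]] += 1
    return dict(counts)


def imbalance(counts):
    """Difference in shard count between the fullest and emptiest node."""
    if not counts:
        return 0
    return max(counts.values()) - min(counts.values())


# Hypothetical rows, trimmed to the fields used above.
rows = [
    {"index": "test", "shard": "0", "state": "STARTED", "node": "node1"},
    {"index": "test", "shard": "1", "state": "STARTED", "node": "node1"},
    {"index": "logs", "shard": "0", "state": "STARTED", "node": "node1"},
    {"index": "logs", "shard": "1", "state": "STARTED", "node": "node2"},
    {"index": "logs", "shard": "2", "state": "UNASSIGNED", "node": None},
]

counts = shards_per_node(rows)
print(counts, "imbalance:", imbalance(counts))
```

A large difference that persists while the cluster is otherwise healthy suggests that a "hard" constraint, rather than the rebalance settings, is holding shards in place.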

Manual rebalancing

It is also possible to rebalance manually using a command like this:

POST /_cluster/reroute?dry_run=true
{
    "commands" : [
        {
            "move" : {
                "index" : "test", "shard" : 0,
                "from_node" : "node1", "to_node" : "node2"
            }
        }
    ]
}

It is advisable to include the dry_run parameter first to check the result of your action. If everything is in order, repeat the command with dry_run=false.

Bear in mind that if you rebalance manually, Elasticsearch may automatically move the same shard (or another one) back, compensating for your previous action. Similarly, there may be constraints that prevent your reallocation from being accepted by the cluster.
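
The dry-run-then-apply workflow described above can be sketched in Python as follows. This is only an illustration: build_move_request is a hypothetical helper that assembles the method, path and body of the reroute call, and sending the request is left to whichever HTTP client you use.

```python
import json


def build_move_request(index, shard, from_node, to_node, dry_run=True):
    """Return (method, path, body) for a _cluster/reroute move command.

    dry_run defaults to True so that a move is only simulated unless
    the caller explicitly asks for it to be applied.
    """
    path = "/_cluster/reroute?dry_run=" + ("true" if dry_run else "false")
    body = {
        "commands": [
            {
                "move": {
                    "index": index,
                    "shard": shard,
                    "from_node": from_node,
                    "to_node": to_node,
                }
            }
        ]
    }
    return "POST", path, body


# Step 1: simulate the move and inspect the response.
preview = build_move_request("test", 0, "node1", "node2")
# Step 2: the identical command, this time applied for real.
apply = build_move_request("test", 0, "node1", "node2", dry_run=False)

print(preview[0], preview[1])
print(json.dumps(preview[2], indent=2))
```

Building both requests from the same arguments ensures that the command you apply is exactly the one you previewed.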


Log errors related to this ES concept


Has a wrong value ; defaulting to indices_all_active
Updating cluster.routing.allocation.cluster-concurrent-rebalance from ; to
Cluster.routing.allocation.allow_rebalance has a wrong value . defaulting to indices_all_active
Updating cluster.routing.allocation.cluster_concurrent_rebalance from . to
