Elasticsearch Node Concurrent Recoveries Setting is Too High / Low

Opster Team

March 2021


An overview of Node_Concurrent_Recoveries_High and Node_Concurrent_Recoveries_Low. 

What it means

The node concurrent recoveries setting determines the maximum number of shard recoveries that can run at once on each node. Recovering shards consumes both disk and network resources, so if the setting is too high, recoveries can compete with indexing and search for those resources. For that reason it is advisable to limit the number of shards that can be recovered on a given node at any one time.

If, on the other hand, the concurrent recoveries setting is too low, recovery may be slower than usual or, in the extreme, the cluster may not be able to recover shards at all. While shards remain unrecovered, the cluster has fewer replicas than planned, which can cause performance issues, and an index may even be left unwritable. The cluster may then stay yellow or red for a long period of time.
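To judge whether recoveries are actually slow or stalled, it can help to look at the cluster health and the list of active recoveries. The requests below are a minimal sketch in console syntax; the filter_path selection is only an assumption about which fields are most relevant here, and output columns may vary between Elasticsearch versions.

```console
# Cluster status plus counts of shards that are initializing, relocating or unassigned
GET _cluster/health?filter_path=status,initializing_shards,relocating_shards,unassigned_shards

# Shard recoveries that are currently in progress, with column headers
GET _cat/recovery?active_only=true&v
```

If the number of unassigned or initializing shards stays high while only a few recoveries are active per node, the concurrency limits are one possible cause worth checking.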

There are a number of different settings that are similar but have subtle differences:

cluster.routing.allocation.node_concurrent_incoming_recoveries (default 2)

How many concurrent incoming shard recoveries (normally replicas) are allowed to happen on a node. 

cluster.routing.allocation.node_concurrent_outgoing_recoveries (default 2)

How many concurrent outgoing shard recoveries are allowed to happen on a node. 

cluster.routing.allocation.node_concurrent_recoveries (default 2)

This is a shortcut setting that sets both cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries at once.

cluster.routing.allocation.node_initial_primaries_recoveries (default 4)

This differs from the settings above because it covers the recovery of an unassigned primary shard using data from the node's local disk, for example after a node restart. Because these operations don't require network transfer, a larger number of them can run in parallel on the same node.
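As an illustration of how the incoming and outgoing limits can be set independently rather than through the combined setting, the request below is a sketch only. The values 4 and 2 are assumptions chosen for the example, not recommendations; suitable values depend on the disk and network capacity of the nodes.

```console
# Illustrative values only: allow more incoming than outgoing recoveries per node
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 4,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": 2
  }
}
```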

How to resolve it

Check the current cluster settings:

GET _cluster/settings
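By default this request returns only settings that have been set explicitly, so an empty response usually means the defaults are in effect. To see the effective values, including defaults, a request along the following lines may help. It is a sketch that assumes the include_defaults and filter_path query parameters are available in the version in use.

```console
# Show explicit and default values for the node_* recovery settings only
GET _cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.node_*
```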

If necessary, change the concurrent recovery settings. In general, the defaults are good values to use.

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 2
  }
}
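
If the setting was overridden earlier and the goal is simply to return to the default, one option is to clear the override instead of writing a number. The sketch below assumes the override was applied as a transient setting; if it was applied as a persistent setting, the same null assignment would go under "persistent" instead.

```console
# Assigning null removes the explicit value so the default applies again
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": null
  }
}
```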
