Elasticsearch Enable Adaptive Replica Selection

Opster Team

March 2021


Overview

Adaptive replica selection is a process intended to prevent a distressed Elasticsearch node from delaying the response to queries, while reducing the search load on that node.

To understand how it works, imagine a situation where a single node is in distress. This could be because of hardware, network or configuration issues. Whatever the cause, the response times for shards on that node are much longer than the response times from the other nodes.

When an Elasticsearch node receives a query, it needs a response from one copy (primary or replica) of every shard in every index covered by that query, so multiple nodes are usually involved in producing the response. Without adaptive replica selection, Elasticsearch distributes these shard-level requests across all available copies using a “round robin” approach, so the node in distress receives the same share of requests as the healthy nodes. With adaptive replica selection, Elasticsearch favors the copies on nodes that have been responding best, and in practice requests data from a shard on a distressed node only when there is no better alternative (for example, when there are no other replicas). The result is reduced load on distressed nodes and shorter response times.
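The difference between the two strategies can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the ranking formula Elasticsearch actually uses: the node names and latencies are hypothetical, and the "adaptive" picker simply chooses the copy with the lowest moving-average response time observed so far.

```python
from itertools import cycle

# Hypothetical average response times (ms) for three nodes that each hold
# a copy of the same shard. "node-c" plays the role of the distressed node.
RESPONSE_MS = {"node-a": 20.0, "node-b": 25.0, "node-c": 400.0}


def round_robin(nodes, n_requests):
    """Send requests to each shard copy in turn, regardless of node health."""
    order = cycle(nodes)
    return [next(order) for _ in range(n_requests)]


def adaptive(nodes, n_requests, alpha=0.3):
    """Pick the copy with the lowest moving-average response time so far.

    A simplified stand-in for adaptive replica selection (assumption: only
    response time is tracked, and slow nodes are never re-probed).
    """
    ewma = {node: 0.0 for node in nodes}  # every node starts untested
    chosen = []
    for _ in range(n_requests):
        node = min(nodes, key=lambda n: ewma[n])
        chosen.append(node)
        # Blend the newly observed latency into the moving average.
        ewma[node] = alpha * RESPONSE_MS[node] + (1 - alpha) * ewma[node]
    return chosen


def mean_latency(choices):
    """Average response time (ms) over a list of chosen nodes."""
    return sum(RESPONSE_MS[n] for n in choices) / len(choices)


if __name__ == "__main__":
    nodes = list(RESPONSE_MS)
    for name, picker in (("round robin", round_robin), ("adaptive", adaptive)):
        choices = picker(nodes, 30)
        print(name, "requests to node-c:", choices.count("node-c"),
              "mean latency (ms):", round(mean_latency(choices), 1))
```

In this sketch, round robin keeps sending a third of the requests to the slow node, while the adaptive picker tries it once and then avoids it.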

How to resolve it

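Adaptive replica selection is enabled by default from version 7.0 onwards. It is available from version 6.1 onwards, where it is disabled by default. Note that a setting applied under "transient" is lost on a full cluster restart; place it under "persistent" instead if it should survive one. To enable it, run the following: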

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}
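If you prefer to apply and verify the setting from a script, the request above could be built along the following lines. This is a sketch using only the Python standard library; the cluster address `http://localhost:9200`, the absence of authentication, and the helper names `build_enable_request`, `fetch_settings` and `is_enabled` are all assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: an unsecured cluster reachable at this address.
BASE_URL = "http://localhost:9200"
SETTING = "cluster.routing.use_adaptive_replica_selection"


def build_enable_request(base_url=BASE_URL, scope="transient"):
    """Build (but do not send) the PUT /_cluster/settings request."""
    body = json.dumps({scope: {SETTING: True}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def fetch_settings(base_url=BASE_URL):
    """Fetch cluster settings in flat form, including default values."""
    url = f"{base_url}/_cluster/settings?include_defaults=true&flat_settings=true"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def is_enabled(settings):
    """Read the effective value from a flat cluster settings response.

    Transient values take precedence over persistent ones, which take
    precedence over defaults.
    """
    for scope in ("transient", "persistent", "defaults"):
        value = settings.get(scope, {}).get(SETTING)
        if value is not None:
            return str(value).lower() == "true"
    return False


if __name__ == "__main__":
    # Sends the request to the assumed cluster, then checks the result.
    with urllib.request.urlopen(build_enable_request()) as resp:
        print(json.load(resp))
    print("enabled:", is_enabled(fetch_settings()))
```

Passing `scope="persistent"` to `build_enable_request` would store the setting so that it survives a full cluster restart.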

