Elasticsearch OpenSearch Max Shards Per Node Exceeded

Opster Team

Oct 30, 2022 | 2 min read



Overview

OpenSearch lets you limit the number of shards allocated to each node. Once that limit is reached, further shards cannot be allocated and remain unassigned. If replica shards are unassigned, the cluster status is yellow: you have no replica copies of that data, and could lose it if the primary shard is lost or corrupted.

If primary shards are unassigned, the cluster status is red and you cannot write data to the affected index at all. If you get this warning, take the necessary action to fix it as soon as possible.

The shards per node limit may have been set at the index level or at the cluster level, so you need to find out which setting is causing this warning.

How to fix it

Check to see whether the limit is at a cluster level or index level.

Cluster level shards limit

Run: 

GET /_cluster/settings

Look for a setting:

cluster.routing.allocation.total_shards_per_node

If you don’t see the above setting, then ignore this section, and go to index level shards limit below.
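The check above can be scripted. The following is a minimal Python sketch, not an official client: it assumes you have already parsed the JSON body returned by GET /_cluster/settings into a dict, and the function names and the example response are hypothetical. It handles both the default nested form and the flat form returned with ?flat_settings=true.

```python
SETTING = "cluster.routing.allocation.total_shards_per_node"


def flatten(settings, prefix=""):
    """Turn nested settings ({"cluster": {"routing": ...}}) into dotted keys."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat


def find_cluster_limit(response):
    """Return (section, value) for the cluster-level limit, or None if unset.

    Transient settings are checked first because they take precedence
    over persistent ones.
    """
    for section in ("transient", "persistent"):
        flat = flatten(response.get(section, {}))
        if SETTING in flat:
            return section, int(flat[SETTING])
    return None


# Hypothetical response body, for illustration only.
example = {
    "persistent": {
        "cluster": {"routing": {"allocation": {"total_shards_per_node": "1000"}}}
    },
    "transient": {},
}
print(find_cluster_limit(example))  # ('persistent', 1000)
```

A result of None means the cluster-level limit is not set explicitly, so the index-level settings are the next place to look.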

As a quick fix, you can either delete old indices or raise the shards per node limit to the value you need. Be aware that a large number of shards on a node can cause performance problems and, in extreme cases, even bring your cluster down.

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.total_shards_per_node": 1000
  }
}

It is preferable to apply a permanent fix. For examples of solutions to this issue in Elasticsearch (where the same principles apply), see Shards Too Small (Oversharding) in Elasticsearch – Explained and Elasticsearch Search Latency Due to Bursts of Traffic – A Complete Guide.

Index level shards limit

It is possible to limit the number of shards per node for a given index. Check the settings for the yellow or red index with:

GET /<index>/_settings/index.routing*

Look for the setting: index.routing.allocation.total_shards_per_node

This setting is sometimes used to force OpenSearch to spread the shards of a certain index across the nodes of a cluster, but it may conflict with other cluster allocation settings (e.g. if the disk is getting full on one node, or if the number of nodes has been reduced).
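To see why losing a node can trigger the warning, it helps to do the arithmetic. The sketch below is a simplified Python illustration with hypothetical function names: it only compares the number of shard copies of one index with the number of slots the limit allows, and ignores every other allocation rule (disk watermarks, awareness, and so on), so treat the result as a lower bound on unassigned shards.

```python
def shard_copies(primaries, replicas):
    """Total shard copies of an index: each primary plus its replicas."""
    return primaries * (1 + replicas)


def unassigned_due_to_limit(primaries, replicas, nodes, limit):
    """Lower bound on shards left unassigned by the per-node limit alone.

    A limit of -1 means unbounded, so nothing is blocked by this rule.
    """
    if limit < 0:
        return 0
    capacity = nodes * limit  # most copies of this index the cluster may hold
    return max(0, shard_copies(primaries, replicas) - capacity)


# 6 primaries with 1 replica each = 12 copies, limit of 2 per node.
print(unassigned_due_to_limit(6, 1, nodes=6, limit=2))  # 0: exactly fits
print(unassigned_due_to_limit(6, 1, nodes=5, limit=2))  # 2: one node lost
```

In this example the limit was sized for exactly six nodes, so a single node failure leaves two shard copies with nowhere to go.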

Before changing the setting, it is worth considering why OpenSearch is unable to respect the rule, and fixing the root cause (e.g. deleting old indices, or recovering or replacing a node that is down). However, if that is not possible, if the current setting is simply wrong, or if you only need a short-term fix, you can change the index level setting using the following:

PUT /<index>/_settings
{
  "index.routing.allocation.total_shards_per_node": -1
}

Note that in the request above, -1 means unbounded (no limit). Alternatively, set the value to whatever number you need.
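If you prefer to script the change, the request can be built with Python's standard library. This is a hedged sketch only: the base URL http://localhost:9200, the index name my-index, and the function name are assumptions for illustration, and the request is constructed but deliberately not sent. Authentication and TLS, which most clusters require, are left out.

```python
import json
import urllib.parse
import urllib.request

SETTING = "index.routing.allocation.total_shards_per_node"


def build_limit_request(base_url, index, limit):
    """Build (but do not send) a PUT /<index>/_settings request.

    limit is the new per-node shard limit for the index; -1 means unbounded.
    """
    if not isinstance(limit, int):
        raise TypeError("limit must be an integer")
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(index, safe='')}/_settings"
    body = json.dumps({SETTING: limit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed host and index name, for illustration only.
req = build_limit_request("http://localhost:9200", "my-index", -1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To actually apply the change you would pass the request to urllib.request.urlopen, after adding whatever credentials your cluster needs.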


