Elasticsearch Shards Too Large

Elasticsearch Shards Too Large

Opster Team

Nov 2020


In addition to reading this guide, run the Elasticsearch Health Check-Up. Detect problems and improve performance by analyzing your shard sizes, threadpools, memory, snapshots, disk watermarks and many more.
Free tool that requires no installation with +1000 users.

Run the Elasticsearch check-up to receive recommendations like this:

checklist Run Check-Up
error

The following configuration error was detected on node 123...

error-img

Description

This error can have a severe impact on your system. It's important to understand that it was caused by...

error-img

Recommendation

In order to resolve this issue and prevent it from occurring again, we recommend that you begin by changing the configuration to...

1

X-PUT curl -H "Content-Type: application/json" [customized recommendation]

Overview

It is a best practice that Elasticsearch shard size should not go above 50GB for a single shard.  

The limit for shard size is not directly enforced by Elasticsearch. However, if you go above this limit you can find that Elasticsearch is unable to relocate or recover index shards (with the consequence of possible loss of data) or you may reach the lucene hard limit of 2 ³¹ documents per index.

How to resolve this issue

If your shards are too large, then you have 3 options:

1. Delete records from the index

If appropriate for your application, you may consider permanently deleting records from your index (for example old logs or other unnecessary records).

POST /my-index/_delete_by_query
{
     "query": {
        "range" : {
            "@timestamp" : {
          
                 "lte" : now-5d

            }
        }
    }
}

2. Re-index into several small indices

The following would reindex into one index for each event_type.

POST _reindex
{

  "source": {
	"index": "logs-all-events"
  },
  "dest": {
	"index": "logs-2-"
  },
  "script": {
	"lang": "painless",
	"source": "ctx._index = 'logs-2-' + (ctx._source.event_type)"
  }
}

3. Re-index into another single index but increase the number of shards

PUT /my_new_index/_settings
{
    "index" : {
        "number_of_shards" : 2
    }
}
POST _reindex
{
  "source": {
    "index": "my_old_index"
  },
  "dest": {
    "index": "my_new_index"
  }
}

Care points for reindexing

Notice duplicate data during the reindex process

During the reindex process, if your application uses wildcards in the index name (eg. logs-*) then results may be duplicated during the reindex process since you will have two copies of the data until the process completes. This can be avoided through the use of aliases.

Notice modifications and updates during the reindex process

If data is susceptible to updates during the reindex process, then a procedure needs to be identified to capture them and ensure that the new indices reflect all modifications that are made during the transition reindexing period. For more information please refer to Opster tips on reindexing  

How to avoid the issue

There are various mechanisms to optimize shard size.

Apply ILM (index lifecycle management)

Using ILM you can get Elasticsearch to automatically create a new index when your current index size reaches a given maximum size or age.  ILM is suitable for documents which require no updates, but is not a viable option when documents require frequent updating, since to do so you need to know which of the previous index volumes contains the document you want to update.

Application strategy

You can design your application to ensure that shards are created of a sensible size:

Date based approach

myindex-yyyy.MM.dd

or myindex-yyyy.MM

This has the advantage that it is easy to delete old data simply by deleting older indices.

Attribute based approach

myindex-<clientID>

This has the advantage of allowing you to know exactly which index to search/update if you know the client ID. 

The disadvantage is that different clients may produce very different volumes of documents (and hence different shard sizes). This can be mitigated to some extent by adjusting the number of shards per client, but you may create the opposite problem (oversharding) if you have a large number of clients.  

ID range based approach

This may be possible where documents carry an ID coming from another system or application (such as a sql ID, or client ID) which enables us to distribute documents evenly.

eg. myindex-<last_digit_of_ID>

The above would give us 10 indices which will probably be evenly distributed, and also deterministic (we know which index we need to update).

Run the Elasticsearch check-up to receive recommendations like this:

checklist Run Check-Up
error

The following configuration error was detected on node 123...

error-img

Description

This error can have a severe impact on your system. It's important to understand that it was caused by...

error-img

Recommendation

In order to resolve this issue and prevent it from occurring again, we recommend that you begin by changing the configuration to...

1

X-PUT curl -H "Content-Type: application/json" [customized recommendation]





Improve Elasticsearch Performance

Run The Analysis