Elasticsearch OpenSearch Shards Too Large

By Opster Team

Updated: Jun 27, 2023

| 2 min read

Overview

It is a best practice that OpenSearch shard size should not go above 50GB for a single shard.  

The limit for shard size is not directly enforced by OpenSearch. However, if you go above this limit you can find that OpenSearch is unable to relocate or recover index shards (with the consequence of possible loss of data) or you may reach the lucene hard limit of 2 ³¹ documents per index.

How to resolve this issue

If your shards are too large, then you have 3 options:

1. Delete records from the index

If appropriate for your application, you may consider permanently deleting records from your index (for example old logs or other unnecessary records).

POST /my-index/_delete_by_query
{
     "query": {
        "range" : {
            "@timestamp" : {
          
                 "lte" : now-5d

            }
        }
    }
}

2. Re-index into several small indices

The following would reindex into one index for each event_type.

POST _reindex
{

  "source": {
	"index": "logs-all-events"
  },
  "dest": {
	"index": "logs-2-"
  },
  "script": {
	"lang": "painless",
	"source": "ctx._index = 'logs-2-' + (ctx._source.event_type)"
  }
}

3. Re-index into another single index but increase the number of shards

PUT /my_new_index/_settings
{
    "index" : {
        "number_of_shards" : 2
    }
}
POST _reindex
{
  "source": {
    "index": "my_old_index"
  },
  "dest": {
    "index": "my_new_index"
  }
}

Care points for reindexing

Notice duplicate data during the reindex process

During the reindex process, if your application uses wildcards in the index name (eg. logs-*) then results may be duplicated during the reindex process since you will have two copies of the data until the process completes. This can be avoided through the use of aliases.

Notice modifications and updates during the reindex process

If data is susceptible to updates during the reindex process, then a procedure needs to be identified to capture them and ensure that the new indices reflect all modifications that are made during the transition reindexing period. For more information please refer to Opster tips on reindexing (where the same principles apply).

How to avoid the issue

There are various mechanisms to optimize shard size.

Apply ISM (Index State Management)

Using ISM you can get OpenSearch to automatically create a new index when your current index size reaches a given maximum size or age.  ISM is suitable for documents which require no updates, but is not a viable option when documents require frequent updating, since to do so you need to know which of the previous index volumes contains the document you want to update.

Application strategy

You can design your application to ensure that shards are created of a sensible size:

Date based approach

myindex-yyyy.MM.dd

or myindex-yyyy.MM

This has the advantage that it is easy to delete old data simply by deleting older indices.

Attribute based approach

myindex-<clientID>

This has the advantage of allowing you to know exactly which index to search/update if you know the client ID. 

The disadvantage is that different clients may produce very different volumes of documents (and hence different shard sizes). This can be mitigated to some extent by adjusting the number of shards per client, but you may create the opposite problem (oversharding) if you have a large number of clients.  

ID range based approach

This may be possible where documents carry an ID coming from another system or application (such as a sql ID, or client ID) which enables us to distribute documents evenly.

eg. myindex-<last_digit_of_ID>

The above would give us 10 indices which will probably be evenly distributed, and also deterministic (we know which index we need to update).

How to reduce your OpenSearch costs by optimizing your shards

Watch the video below to learn how to save money on your deployment by optimizing your shards.

How helpful was this guide?

We are sorry that this post was not useful for you!

Let us improve this post!

Tell us how we can improve this post?