Elasticsearch High Indexing Throttle Time

By Opster Team

Updated: Jan 28, 2024

What it means

During indexing, Elasticsearch accumulates documents in memory and then writes them to disk as a new Lucene segment. Creating a large number of small segments is inefficient, so a separate merge process combines the small segments created at index time into larger ones.

However, merging consumes a lot of resources, particularly disk I/O. When Elasticsearch detects that merging cannot keep up with the rate of indexing, it starts to throttle indexing, which shows up as a high index throttle time.
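
To check whether throttling is happening, the node stats API (`GET /_nodes/stats/indices/indexing`) reports `throttle_time_in_millis` and `is_throttled` for each node. The Python sketch below assumes that response has already been fetched and parsed into a dict. The function name `throttled_nodes` and the sample payload are illustrative only and are not part of any client library.

```python
def throttled_nodes(node_stats, min_throttle_ms=0):
    """Return (node_name, throttle_ms, is_throttled) tuples for nodes whose
    indexing throttle time exceeds min_throttle_ms, most throttled first."""
    rows = []
    for node_id, node in node_stats.get("nodes", {}).items():
        indexing = node.get("indices", {}).get("indexing", {})
        throttle_ms = indexing.get("throttle_time_in_millis", 0)
        if throttle_ms > min_throttle_ms:
            rows.append((
                node.get("name", node_id),
                throttle_ms,
                indexing.get("is_throttled", False),
            ))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical, heavily trimmed node stats response used for illustration.
sample_stats = {
    "nodes": {
        "id-1": {"name": "data-1", "indices": {"indexing": {
            "throttle_time_in_millis": 1200, "is_throttled": False}}},
        "id-2": {"name": "data-2", "indices": {"indexing": {
            "throttle_time_in_millis": 95000, "is_throttled": True}}},
        "id-3": {"name": "data-3", "indices": {"indexing": {
            "throttle_time_in_millis": 0, "is_throttled": False}}},
    }
}

for name, throttle_ms, active in throttled_nodes(sample_stats):
    print(f"{name}: throttled for {throttle_ms / 1000:.1f}s, currently throttled: {active}")
```

Because the throttle time is a running total, comparing two samples taken some time apart is a more reliable signal of ongoing throttling than a single reading.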

When this happens, indexing operations are likely to be queued, and indexing requests may eventually be rejected. In the best case, data will not be completely up to date; in the worst case, data may be lost if the application fails to retry the rejected write requests.

Retrying throttled requests can also put an extra burden on processors upstream since they have to hold data in a queue (if possible) and use resources retrying requests.
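
As a rough illustration of the retry logic an upstream application might need, the Python sketch below resends only the documents whose bulk items came back with HTTP status 429 (rejected), waiting with exponential backoff between attempts. `send_bulk` is a hypothetical callable that submits a batch and returns the parsed bulk response; the function names and default values are assumptions, not a recommended configuration.

```python
import random
import time


def rejected_docs(docs, bulk_response):
    """Return the docs whose bulk item was rejected with HTTP 429.

    Bulk response items are in the same order as the submitted actions."""
    retry = []
    for doc, item in zip(docs, bulk_response.get("items", [])):
        # Each item has a single key naming the action (index, create, ...).
        result = next(iter(item.values()))
        if result.get("status") == 429:
            retry.append(doc)
    return retry


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter, in seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def index_with_retry(docs, send_bulk, max_attempts=5, sleep=time.sleep):
    """Send docs in bulk, retrying rejected ones.

    Returns the docs that were still rejected after max_attempts."""
    pending = list(docs)
    for attempt in range(max_attempts):
        if not pending:
            break
        response = send_bulk(pending)
        pending = rejected_docs(pending, response)
        if pending and attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return pending
```

Anything returned by `index_with_retry` has not been indexed, so the caller still has to decide whether to queue it, log it, or fail.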

How to resolve

When index throttling occurs, you should try to optimize indexing using some of the following actions:

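  • Reduce the index refresh rate by increasing the refresh interval:
PUT /my-index-000001/_settings
{
  "index" : {
    "refresh_interval" : "30s"
  }
}
  • Reduce the number of replicas (with 0 replicas there is no redundant copy of the data, so this is best kept to temporary situations such as a large initial load):
PUT /my-index-000001/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}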
  • Use bulk indexing rather than individual indexing
  • Upgrade rotating disks to SSD or NVMe types
  • Optimize mappings to reduce unnecessary fields
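
To illustrate the bulk indexing suggestion above, the Python sketch below builds the newline-delimited JSON body that `POST /_bulk` expects: an action line followed by a source line for each document. The helper names and the batch size are assumptions for illustration; a suitable batch size depends on document size and cluster capacity.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize docs into an NDJSON body for POST /_bulk."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


def chunked(docs, size):
    """Yield successive batches of at most `size` docs."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


# Hypothetical documents; one bulk request body is printed per batch.
docs = [{"message": f"event {i}"} for i in range(5)]
for batch in chunked(docs, size=2):
    print(build_bulk_body("my-index-000001", batch), end="")
```

The request should be sent with the `Content-Type: application/x-ndjson` header.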

Note that the default dynamic mapping for strings creates both a text field and a keyword sub-field, which is wasteful if you don’t need both.
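
One way to avoid the double mapping is to declare string fields explicitly when the index is created. The Python sketch below builds such a request body (for example, for `PUT /my-index-000002`); the field names and the helper `build_mappings` are hypothetical. It also adds a dynamic template so that any unlisted string field is mapped as `keyword` only, which is an assumption about what the data needs rather than a general recommendation.

```python
import json


def build_mappings(keyword_fields, text_fields):
    """Build an index-creation body with one explicit type per string field."""
    properties = {name: {"type": "keyword"} for name in keyword_fields}
    properties.update({name: {"type": "text"} for name in text_fields})
    return {
        "mappings": {
            # Any string field not listed below becomes keyword only.
            "dynamic_templates": [
                {
                    "strings_as_keywords": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword"},
                    }
                }
            ],
            "properties": properties,
        }
    }


# Hypothetical field names for illustration.
body = build_mappings(keyword_fields=["status", "host"], text_fields=["message"])
print(json.dumps(body, indent=2))
```

Fields that need full-text search should stay in `text_fields`; mapping them as `keyword` only would limit them to exact-match queries and aggregations.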

Notes and good things to know

If you see that index throttling only occurs on some data nodes and not on all of them, this may be because the index has too few primary shards to spread the indexing activity across all the nodes. This could happen if you have, for instance, one “busy” index with just one shard, which would concentrate all the indexing activity on the node that holds the shard for this index. In this case, the solution would be to increase the number of primary shards in the index settings or index template and then create a new index or roll over to one, since the number of primary shards of an existing index cannot be changed in place.
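
To check how an index's primary shards are spread across nodes, the output of `GET /_cat/shards?format=json` can be summarized per node. The Python sketch below assumes that output has already been parsed into a list of dicts with the `index`, `prirep` and `node` columns; the function names, node names and sample rows are illustrative.

```python
from collections import Counter


def primaries_per_node(cat_shards, index_name):
    """Count assigned primary shards of index_name on each node."""
    counts = Counter()
    for row in cat_shards:
        is_primary = row.get("prirep") == "p"
        # Unassigned shards have no node and are skipped.
        if row.get("index") == index_name and is_primary and row.get("node"):
            counts[row["node"]] += 1
    return dict(counts)


def nodes_without_primaries(cat_shards, index_name, data_nodes):
    """Return the data nodes that hold no primary shard of index_name."""
    counts = primaries_per_node(cat_shards, index_name)
    return sorted(node for node in data_nodes if node not in counts)


# Hypothetical rows: a single-shard index with one replica on a three-node cluster.
sample_rows = [
    {"index": "busy-index", "shard": "0", "prirep": "p", "node": "data-1"},
    {"index": "busy-index", "shard": "0", "prirep": "r", "node": "data-2"},
    {"index": "other-index", "shard": "0", "prirep": "p", "node": "data-3"},
]

print(primaries_per_node(sample_rows, "busy-index"))
print(nodes_without_primaries(sample_rows, "busy-index", ["data-1", "data-2", "data-3"]))
```

If most data nodes show up in the second list for a heavily written index, the index probably has too few primary shards for the cluster.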

For additional recommendations on how to improve your indexing rate, see the full list of recommendations here.   
