Opster Team
To avoid receiving the log “now throttling indexing for shard: segment writing can’t keep up” in the future, we recommend you try running the Elasticsearch Error Check-Up which can resolve issues that cause many errors.
This guide will help you check for common problems that cause the log to appear. It’s important to understand the issues related to it, so to get started, read the general overview on common issues and tips related to the Elasticsearch concepts: indexing, indices, memory and shard.
Quick overview
This log means that Elasticsearch is putting back-pressure on the indexing process. It is essential to look at this log and take appropriate actions to ensure ES doesn’t crash. Note that this is not an error message. Make sure:
- Elasticsearch can cope with your indexing requirements.
- Search stays near real-time: when indexing is delayed, new documents are not yet searchable, which degrades search results and the user experience.
Here are some important terms to help you understand this log message:
Shard
Each Elasticsearch index is made up of one or more shards, and each shard is an Apache Lucene index. Elasticsearch internally uses Lucene to index and search.
Segments
Every shard is made of multiple segments, and these segments are immutable for better performance (they can be cached and used in a multithreading environment).
Indexing buffer
Each index request (a document that needs to be indexed) is first written to an in-memory area called the indexing buffer (every shard has its own indexing buffer), and from the buffer it is written to segments. Writing a segment means writing to disk.
Log message deep-dive
Elasticsearch reserves 10% (by default) of the total heap for indexing activity, and this budget is shared across all shards that are indexing on the node.
indices.memory.index_buffer_size controls the above.
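As a rough illustration of what that percentage means in practice, the sketch below estimates the shared buffer for a given heap size. The function name and the example heap size are hypothetical, and the 48 MB floor is an assumption based on the documented default of indices.memory.min_index_buffer_size, so check it against your version.

```python
def index_buffer_bytes(heap_bytes, percent=10.0, min_bytes=48 * 1024**2):
    """Estimate the node-wide indexing buffer for a given JVM heap size."""
    size = int(heap_bytes * percent / 100)
    # Assumed floor: the buffer is never smaller than min_bytes.
    return max(size, min_bytes)


GIB = 1024**3
MIB = 1024**2

# Buffer (in MiB) shared by all indexing shards on a node with an 8 GiB heap.
print(index_buffer_bytes(8 * GIB) / MIB)
```

Because the buffer is shared, a node hosting many actively indexing shards gives each of them a smaller slice of the same budget.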
It also maintains a set of throttled shards: shards whose segment writes cannot keep up with incoming documents, and whose indexing is therefore throttled.
When buffered data grows faster than segments can be written to disk, Elasticsearch writes out the buffer of the shard using the most memory, adds that shard to the throttled set and throttles indexing for it (indexing on the shard is slowed down rather than rejected). This is reported by the following log statement:
logger.info("now throttling indexing for shard [{}]: segment writing can't keep up", largest.shard.shardId());
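To make the decision easier to follow, here is a simplified Python model of one pass of that check. It is only a sketch: the function name and shard labels are hypothetical, and the rule that throttling starts once usage exceeds 1.5 times the buffer is an assumption recalled from the Elasticsearch source, not something this article confirms.

```python
def check_buffers(shard_bytes, buffer_limit, bytes_writing=0, throttle_factor=1.5):
    """Return (flushed, throttled) shard names for one pass of the check.

    shard_bytes maps a shard label to the bytes it holds in the indexing buffer.
    """
    total = sum(shard_bytes.values())
    flushed, throttled = [], []
    if total <= buffer_limit:
        return flushed, throttled  # within budget, nothing to do

    # Assumed rule: throttle only when usage is far beyond the budget.
    do_throttle = (bytes_writing + total) > throttle_factor * buffer_limit

    # The largest consumers are written to disk first.
    by_size = sorted(shard_bytes.items(), key=lambda kv: kv[1], reverse=True)
    for name, used in by_size:
        if total <= buffer_limit:
            break
        flushed.append(name)
        total -= used
        if do_throttle:
            throttled.append(name)
    return flushed, throttled


print(check_buffers({"test[0]": 120, "test[1]": 50}, buffer_limit=100))
```

In this model a shard is only throttled when it is both the largest consumer and the node is well over its budget, which is why the log message tends to appear under sustained heavy indexing rather than short bursts.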
Troubleshooting steps
- Figure out which shard and index are being throttled: the shard ID is in the log message, and the _cat/shards API shows the corresponding index, shard and data node.
- Once you have identified the problematic index, tune its indexing settings to match your requirements. For example, during a bulk migration, disable automatic refreshes by setting refresh_interval to -1 until the load completes.
- If search traffic is light and the system is write-heavy, you can allocate more heap to indexing by raising indices.memory.index_buffer_size above its default percentage.
- You can add RAM to the data nodes and increase the Elasticsearch JVM heap, up to a maximum of about 31GB for optimal performance.
Note: For detailed troubleshooting, you can contact Opster’s community support team, as this process can require more details and live debugging.
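The first troubleshooting step can be scripted. The sketch below pulls the index name and shard number out of a log line and looks them up in _cat/shards output. Both the log line and the table are hypothetical samples, and the "[index][shard]" layout of the shard ID is an assumption about how it is printed, so adjust the pattern to what your logs actually show.

```python
import re

# Hypothetical log line; the "[index][shard]" layout is an assumption.
LOG_LINE = "now throttling indexing for shard [[test_index1][1]]: segment writing can't keep up"

# Hypothetical output of: GET _cat/shards?v&h=index,shard,prirep,state,node
CAT_SHARDS = """\
index       shard prirep state   node
test_index1 0     p      STARTED node-1
test_index1 1     p      STARTED node-2
test_index1 1     r      STARTED node-1
"""


def throttled_shard(log_line):
    """Extract (index name, shard number) from the throttling log line."""
    match = re.search(r"shard \[\[(.+?)\]\[(\d+)\]\]", log_line)
    return (match.group(1), int(match.group(2))) if match else None


def locate(cat_output, index, shard):
    """Return the _cat/shards rows (as dicts) for the given index and shard."""
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    rows = [dict(zip(header, line.split())) for line in lines[1:]]
    return [r for r in rows if r["index"] == index and int(r["shard"]) == shard]


index, shard = throttled_shard(LOG_LINE)
for row in locate(CAT_SHARDS, index, shard):
    print(row["prirep"], row["node"])
```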

Index and indexing in Elasticsearch - 3 min
Overview
Indexing is the process of adding documents to and updating documents on an Elasticsearch index.
Examples
In its simplest form, you can index a document like this:
POST /test/_doc
{ "message": "Opster Rocks Elasticsearch Management" }
This will create the index “test” (if it doesn’t already exist) and add a document with the source equal to the body of the POST call. In this case, the ID will be created automatically. If you repeat this command, a second document will be created with an identical source but a different ID.
Alternatively, you can do this:
PUT /test/_doc/1
{ "message": "Opster Elasticsearch Management and Troubleshooting" }
This is almost the same, but in this case, the call sets the ID of the document to 1. If you repeat the command modifying the message, you will modify the original document, replacing the previous source with the latest source.
However, note that this is NOT the same as an update operation, which uses a different API and lets you modify certain fields of a document while leaving the others unchanged.
Notes and good things to know
You can set your own ID if necessary (especially if you later need to update the same ID) but this comes at a performance penalty. If you don’t need to update documents, then let Elasticsearch set its own ID automatically.
If you need to index many documents at once, it is much more efficient to use the BULK API to carry out these operations with a single call.
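A bulk request body is newline-delimited JSON: an action line followed by the document source, repeated for each document. The sketch below builds such a body in Python. The helper name and sample documents are hypothetical, and sending the body (a POST to /_bulk with the application/x-ndjson content type) is left out.

```python
import json


def build_bulk_body(index, docs):
    """Build a newline-delimited _bulk body that indexes each document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                            # source line
    return "\n".join(lines) + "\n"  # the body must end with a newline


body = build_bulk_body("test", [{"message": "first"}, {"message": "second"}])
print(body)
```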
Indexing does not make documents searchable immediately. Documents are not available for search until the index has been refreshed, which by default happens every 1 second. Increasing the refresh interval reduces the indexing load on the cluster and increases indexing speed. The refresh interval can be changed in the index settings.
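As a small sketch of what changing the refresh time looks like, the helper below builds the request for the index settings API. The function name is hypothetical, and it only returns the request pieces rather than calling a cluster.

```python
import json


def refresh_settings_request(index, interval):
    """Return (method, path, body) for changing an index's refresh_interval."""
    body = {"index": {"refresh_interval": interval}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


# Pause refreshes during a heavy load, then go back to the 1-second default.
print(refresh_settings_request("test", "-1"))
print(refresh_settings_request("test", "1s"))
```

A value of -1 disables automatic refreshes, so remember to restore a normal interval once the load has finished.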
You can apply version control by setting the version parameter (?version=3) and indicating version_type=external. Elasticsearch will then reject any index request whose version is less than or equal to the current version of the document. This is useful when running distributed processes where you cannot guarantee that updated documents arrive in the correct order.
PUT test/_doc/1?version=20&version_type=external
{ "message" : "using external version the document will be modified only if version is greater than previous!" }
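The effect on out-of-order writes can be modelled in a few lines. This is an in-memory sketch, not Elasticsearch code: the store, the function and the exception are hypothetical, and the only rule it encodes is that an external version must be strictly greater than the stored one.

```python
class VersionConflict(Exception):
    """Raised when an incoming version is not newer than the stored one."""


def index_external(store, doc_id, source, version):
    """Apply one write to `store` (a dict) using the external-version rule."""
    current = store.get(doc_id)
    if current is not None and version <= current["version"]:
        raise VersionConflict(f"{doc_id}: {version} <= {current['version']}")
    store[doc_id] = {"version": version, "source": source}


store = {}
index_external(store, "1", {"message": "v20"}, 20)
try:
    # An older update that arrives late is rejected instead of overwriting v20.
    index_external(store, "1", {"message": "late v19"}, 19)
except VersionConflict as err:
    print("rejected:", err)
```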
The process of indexing is as follows:
The index request is sent to the primary shard. Once the primary shard has been updated, the request is relayed to the replica shards. The command will not return until the primary shard (at least) has been updated. For greater resilience, you can require a minimum number of shard copies (the primary counts as one) to be active before the operation proceeds by using the parameter ?wait_for_active_shards=2
You can also control which shard the index operation is sent to by using the “routing” parameter. There are two reasons this might be done:
- Certain Elasticsearch features (such as parent-child documents) require the parent and child documents to be held on the same shard.
- Secondly, it may be possible to increase search speeds and reduce load on Elasticsearch by storing similar documents together on the same shard and then specifying the routing for both indexing and searching. Although this can be done explicitly during indexing, it is not recommended. It would be preferable to set this up using the index mapping, so that the routing is determined by an ID value on the source document.
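Conceptually, routing works by hashing the routing value and taking the result modulo the number of primary shards, so equal routing values always map to the same shard. The sketch below shows that idea only: the function name is hypothetical, and CRC32 is a stand-in for the hash Elasticsearch actually uses, so the shard numbers it produces will not match a real cluster.

```python
import zlib


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# Documents that share a routing value always land on the same shard,
# so a search that passes the same routing value only needs to visit that shard.
first = pick_shard("user-42", 2)
second = pick_shard("user-42", 2)
print(first == second)
```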

Index and indexing in Elasticsearch - 3 min
Overview
In Elasticsearch, an index (plural: indices) contains a schema and can have one or more shards and replicas. An Elasticsearch index is divided into shards and each shard is an instance of a Lucene index.
Indices are used to store the documents in dedicated data structures corresponding to the data type of fields. For example, text fields are stored inside an inverted index whereas numeric and geo fields are stored inside BKD trees.
Examples
Create index
The following example uses a mapping without a type name, which applies to Elasticsearch version 7.x onwards. It creates an index named test_index1 with two shards, each having one replica.
PUT /test_index1?pretty
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "tags": { "type": "keyword" },
      "updated_at": { "type": "date" }
    }
  }
}
List indices
All the index names and their basic information can be retrieved using the following command:
GET _cat/indices?v
Index a document
Let’s add a document in the index with the command below:
PUT test_index1/_doc/1
{
  "tags": [ "opster", "elasticsearch" ],
  "date": "01-01-2020"
}
Query an index
GET test_index1/_search
{
  "query": { "match_all": {} }
}
Query multiple indices
It is possible to search multiple indices with a single request. In a raw HTTP request, the index names are sent as a comma-separated list, as shown in the example below. When querying through a programming language client such as Python or Java, the index names are passed as a list.
GET test_index1,test_index2/_search
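If you build the raw HTTP path yourself, the list of index names has to be joined with commas. A minimal sketch, with a hypothetical helper name:

```python
def search_path(indices):
    """Join index names into the comma-separated path used by the _search API."""
    if not indices:
        raise ValueError("at least one index name is required")
    return "/" + ",".join(indices) + "/_search"


print(search_path(["test_index1", "test_index2"]))  # /test_index1,test_index2/_search
```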
Delete indices
DELETE test_index1
Common problems
- It is good practice to define the settings and mapping of an index wherever possible. If this is not done, Elasticsearch tries to guess the data type of each field at indexing time. This automatic process can cause problems such as mapping conflicts, duplicate data and incorrect data types being set in the index. If the fields are not known in advance, it’s better to use dynamic index templates.
- Elasticsearch supports wildcard patterns in index names, which helps when querying multiple indices but can also be very destructive. For example, it is possible to delete all the indices with a single command:
DELETE /*
To disable this, you can add the following line to elasticsearch.yml:
action.destructive_requires_name: true

Log Context
The log “now throttling indexing for shard [{}]: segment writing can’t keep up” is emitted by the class defined in IndexingMemoryController.java.
We extracted the following from the Elasticsearch source code for those seeking in-depth context:
logger.debug("write indexing buffer to disk for shard [{}] to free up its [{}] indexing buffer",
    largest.shard.shardId(), new ByteSizeValue(largest.bytesUsed));
writeIndexingBufferAsync(largest.shard);
totalBytesUsed -= largest.bytesUsed;
if (doThrottle && throttled.contains(largest.shard) == false) {
    logger.info("now throttling indexing for shard [{}]: segment writing can't keep up", largest.shard.shardId());
    throttled.add(largest.shard);
    activateThrottling(largest.shard);
}
}
}