How To Solve Issues Related to Log – Cluster state applier task took above the warn threshold of

Updated: Jan-20

Troubleshooting background

To troubleshoot the Elasticsearch log "Cluster state applier task took above the warn threshold of", it helps to understand the Elasticsearch concepts it relates to: cluster, task and threshold. The sections below explain these concepts, with common problems, examples and useful tips.

Task in Elasticsearch

What it is

A task is equivalent to an Elasticsearch operation, which can be any request performed on an Elasticsearch cluster, for example a delete by query request or a search request. Elasticsearch provides a dedicated Task API for task management. It covers actions ranging from retrieving the status of currently running tasks to canceling a long-running task.

Examples

Get all currently running tasks on all nodes of the cluster

Among other information, the response to the request below contains the IDs of all tasks. These IDs can be used to get detailed information about a particular task.

GET _tasks

Get detailed information about a particular task

clQFAL_VRrmnlRyPsu_p8A:1132678759 is the ID of the task in the request below.

GET _tasks/clQFAL_VRrmnlRyPsu_p8A:1132678759

Get all the current tasks running on particular nodes

GET _tasks?nodes=nodeId1,nodeId2

Cancel a long-running task

clQFAL_VRrmnlRyPsu_p8A:1132678759 is the ID of the task in the request below.

POST /_tasks/clQFAL_VRrmnlRyPsu_p8A:1132678759/_cancel?pretty

Notes
  • The Task API is most useful when you want to investigate a spike in resource utilization in the cluster or cancel an operation.
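
To illustrate how the output of GET _tasks might be used when investigating a spike, here is a minimal Python sketch that picks out long-running tasks from a response body. It assumes the response has already been fetched and decoded from JSON into a dictionary, and that it follows the usual nodes / tasks layout with a running_time_in_nanos field per task. The function name and the sample data are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List, Tuple


def find_long_running_tasks(
    tasks_response: Dict[str, Any], min_seconds: float
) -> List[Tuple[str, str, float]]:
    """Return (task_id, action, running_seconds) for tasks running at least min_seconds.

    Results are sorted from longest-running to shortest.
    """
    found = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            # The Task API reports running time in nanoseconds.
            seconds = task.get("running_time_in_nanos", 0) / 1e9
            if seconds >= min_seconds:
                found.append((task_id, task.get("action", "unknown"), seconds))
    return sorted(found, key=lambda item: item[2], reverse=True)


# Hypothetical, trimmed-down response body used only for illustration.
sample_response = {
    "nodes": {
        "clQFAL_VRrmnlRyPsu_p8A": {
            "tasks": {
                "clQFAL_VRrmnlRyPsu_p8A:1132678759": {
                    "action": "indices:data/write/delete/byquery",
                    "running_time_in_nanos": 95_000_000_000,
                    "cancellable": True,
                },
                "clQFAL_VRrmnlRyPsu_p8A:1132678760": {
                    "action": "indices:data/read/search",
                    "running_time_in_nanos": 40_000_000,
                    "cancellable": True,
                },
            }
        }
    }
}

for task_id, action, seconds in find_long_running_tasks(sample_response, min_seconds=30):
    print(f"{task_id} {action} running for {seconds:.1f}s")
```

The task IDs returned this way can then be passed to the detail or cancel requests shown in the examples.
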

Threshold in Elasticsearch

What it is

Elasticsearch uses several threshold parameters to manage disk storage across the cluster.

What it’s used for
  • Elasticsearch will actively try to relocate shards away from nodes that exceed the high disk watermark threshold.
  • Elasticsearch will NOT allocate new shards, or relocate shards, onto nodes that exceed the low disk watermark threshold.
  • Elasticsearch will prevent all writes to any index that has a shard on a node exceeding the disk.watermark.flood_stage threshold.
  • The info update interval (cluster.info.update.interval) is how often Elasticsearch re-checks disk usage.
Examples
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%",
    "cluster.info.update.interval": "1m"
  }
}
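
As a rough illustration of how the three watermarks described above partition disk usage, the following Python sketch classifies a node's disk usage percentage. It is a simplification under stated assumptions: it only handles percentage thresholds, it uses the example values from the request above as defaults, and the function name and returned labels are hypothetical rather than anything Elasticsearch itself reports.

```python
def disk_watermark_state(
    used_percent: float,
    low: float = 85.0,
    high: float = 90.0,
    flood_stage: float = 95.0,
) -> str:
    """Return which watermark (if any) a node's disk usage exceeds.

    Labels (hypothetical, for illustration only):
      "ok"          - below or at the low watermark
      "low"         - no new shards are allocated to the node
      "high"        - shards are relocated away from the node
      "flood_stage" - writes are blocked on indices with a shard on the node
    """
    if not low <= high <= flood_stage:
        raise ValueError("expected low <= high <= flood_stage")
    if used_percent > flood_stage:
        return "flood_stage"
    if used_percent > high:
        return "high"
    if used_percent > low:
        return "low"
    return "ok"


for usage in (80, 87, 92, 97):
    print(usage, disk_watermark_state(usage))
```
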
Notes and good things to know:
  • You can use absolute values, e.g. "100gb", or percentages, e.g. "90%", but you cannot mix the two on the same cluster.
  • In general, it is recommended to use percentages, since these continue to work when disks are resized.
  • You can put the cluster settings in elasticsearch.yml on each node, but it is recommended to use the PUT _cluster/settings API because it is easier to manage and ensures that the settings are consistent across the cluster.
  • Elasticsearch comes with sensible defaults for these settings, so think twice before modifying them. If you find you are spending a lot of time fine-tuning these settings, it is probably time to invest in more disk space.
  • If the flood_stage threshold is exceeded, Elasticsearch should automatically detect that the block can be released once you delete data (bearing in mind the update interval, which could be, for instance, a minute). However, if you want to accelerate this process, you can unblock an index manually with the following call:
PUT /my_index/_settings
{
  "index.blocks.read_only_allow_delete": null
}
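
The note above about not mixing absolute values and percentages can be checked before a settings request is sent. The Python sketch below is one possible pre-flight check, not part of Elasticsearch. It assumes only the two value styles mentioned in the notes ("90%" and "100gb"), and the function names are hypothetical.

```python
import re
from typing import Dict

# Only the two styles named in the notes are recognised here.
_PERCENT = re.compile(r"^\d+(\.\d+)?%$")
_BYTES = re.compile(r"^\d+(\.\d+)?(b|kb|mb|gb|tb|pb)$", re.IGNORECASE)


def watermark_kind(value: str) -> str:
    """Classify a watermark value as 'percentage' or 'absolute'."""
    text = value.strip()
    if _PERCENT.match(text):
        return "percentage"
    if _BYTES.match(text):
        return "absolute"
    raise ValueError(f"unrecognised watermark value: {value!r}")


def check_watermarks_consistent(settings: Dict[str, str]) -> str:
    """Return the single style used by all disk watermark settings.

    Raises ValueError if percentages and absolute values are mixed.
    Returns 'none' if no watermark setting is present.
    """
    kinds = {
        watermark_kind(value)
        for key, value in settings.items()
        if ".disk.watermark." in key
    }
    if len(kinds) > 1:
        raise ValueError("disk watermarks mix percentages and absolute values")
    return kinds.pop() if kinds else "none"


settings = {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%",
    "cluster.info.update.interval": "1m",
}
print(check_watermarks_consistent(settings))
```
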
Common problems

Inappropriate cluster settings (for example, a disk.watermark.low value that is set too low) can make it impossible for Elasticsearch to allocate shards on the cluster. In particular, bear in mind that these parameters work in combination with other cluster settings (for example, shard allocation awareness), which place further constraints on how Elasticsearch can allocate shards.


To help troubleshoot related issues, we have gathered selected Q&A from the community and issues from GitHub. Please review the following for further information:

1. Lots Of Cluster State Update Task Z

2. Timed Out Waiting For All Nodes To

Timetaken By A Cluster State Update


Log Context

The log "Cluster state applier task [{}] took [{}] above the warn threshold of {}" is emitted by the class in ClusterApplierService.java.
We have extracted the following from the Elasticsearch source code to provide in-depth context:

         }
    }

    protected void warnAboutSlowTaskIfNeeded(TimeValue executionTime, String source) {
        if (executionTime.getMillis() > slowTaskLoggingThreshold.getMillis()) {
            logger.warn("cluster state applier task [{}] took [{}] above the warn threshold of {}", source, executionTime,
                slowTaskLoggingThreshold);
        }
    }

    class NotifyTimeout implements Runnable {
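
When scanning log files for this warning, it can help to pull out the three placeholders from the message format in the excerpt above: the task source, the time taken and the threshold. The following Python sketch does this with a regular expression built from that format string. The sample log line, including its prefix and node names, is made up for illustration, and the function name is hypothetical. As far as we know, the threshold itself corresponds to the cluster.service.slow_task_logging_threshold setting, but treat that as an assumption to verify against your Elasticsearch version.

```python
import re
from typing import Dict, Optional

# Mirrors the format string in warnAboutSlowTaskIfNeeded:
# "cluster state applier task [{}] took [{}] above the warn threshold of {}"
_PATTERN = re.compile(
    r"cluster state applier task \[(?P<source>.*)\] took \[(?P<took>[^\]]+)\] "
    r"above the warn threshold of (?P<threshold>\S+)"
)


def parse_slow_applier_warning(line: str) -> Optional[Dict[str, str]]:
    """Return source, took and threshold from a matching log line, else None."""
    match = _PATTERN.search(line)
    return match.groupdict() if match else None


# Hypothetical log line, for illustration only.
sample_line = (
    "[WARN ][o.e.c.s.ClusterApplierService] [node-1] cluster state applier task "
    "[apply cluster state (from master [master node-2 committed version [42]])] "
    "took [35.2s] above the warn threshold of 30s"
)

parsed = parse_slow_applier_warning(sample_line)
if parsed:
    print(parsed["source"])
    print(parsed["took"], "vs threshold", parsed["threshold"])
```

The greedy match on the source field is deliberate, because the source description can itself contain square brackets.
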





