Elasticsearch Disk Watermark

By Opster Team

Updated: Mar 22, 2023

Before you dig into the details of this technical guide, have you tried asking OpsGPT?

You'll receive concise answers that will help streamline your Elasticsearch/OpenSearch operations.


Try OpsGPT now for step-by-step guidance and tailored insights into your Elasticsearch/OpenSearch operations.

Aside from reading this guide and understanding how to fix log messages related to disk watermarks, we recommend you run the Elasticsearch Health Check-Up. It will detect issues and improve your Elasticsearch performance by analyzing your shard sizes, threadpools, memory, snapshots, disk watermarks and more.

If you’d like to learn how you can reduce the cost of your cluster by adjusting your disk watermarks, run Opster’s Cost Insight tool.

Overview

An Elasticsearch cluster applies several disk “watermark” thresholds to each node. As a node's disk fills up, the first threshold crossed is the “low disk watermark” (85% disk usage by default), after which Elasticsearch stops allocating new shards to that node. The second is the “high disk watermark” (90% by default), after which Elasticsearch tries to relocate shards away from the node. Finally, the “flood stage disk watermark” (95% by default) is reached. Once this threshold is passed, the cluster blocks writes to ALL indices that have at least one shard (primary or replica) on the affected node. Reads (searches) remain possible.
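The progression through the thresholds can be sketched in Python as follows. This is only an illustration, assuming the default percentage thresholds (85%, 90% and 95% of disk used); the function name watermark_stage is hypothetical and not part of Elasticsearch or any client library.

```python
# Assumed default thresholds, expressed as percent of disk used.
LOW_PCT = 85.0
HIGH_PCT = 90.0
FLOOD_STAGE_PCT = 95.0


def watermark_stage(used_pct: float) -> str:
    """Return the highest watermark that a node's disk usage has exceeded."""
    if used_pct > FLOOD_STAGE_PCT:
        return "flood_stage"  # writes blocked on indices with a shard on this node
    if used_pct > HIGH_PCT:
        return "high"         # shards are relocated away from this node
    if used_pct > LOW_PCT:
        return "low"          # no new shards are allocated to this node
    return "ok"


for pct in (70.0, 86.5, 92.0, 97.3):
    print(f"{pct:5.1f}% used -> {watermark_stage(pct)}")
```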

Relevant settings

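The cluster.routing.allocation.disk.watermark settings define three thresholds. Each accepts either a percentage or an absolute byte value: a percentage refers to used disk space, while an absolute value (for example 100gb) refers to the free space remaining. The three watermarks are: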

  1. Low disk watermark
  2. High disk watermark
  3. Flood stage disk watermark
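The difference between percentage and absolute values can be made concrete with a small Python sketch. It assumes that a percentage is compared against used space and a byte value against free space, and it handles only these two forms; the helper names parse_watermark and is_breached are hypothetical.

```python
import re

# Elasticsearch byte-size units are powers of 1024.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_watermark(value: str):
    """Parse '85%' into ('percent_used', 85.0) or '100gb' into ('bytes_free', n)."""
    text = value.strip().lower()
    if text.endswith("%"):
        return ("percent_used", float(text[:-1]))
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(tb|gb|mb|kb|b)", text)
    if match is None:
        raise ValueError(f"unrecognised watermark value: {value!r}")
    number, unit = match.groups()
    return ("bytes_free", int(float(number) * _UNITS[unit]))


def is_breached(value: str, total_bytes: int, free_bytes: int) -> bool:
    """Check whether a disk with the given capacity has crossed the watermark."""
    kind, threshold = parse_watermark(value)
    if kind == "percent_used":
        used_pct = 100.0 * (total_bytes - free_bytes) / total_bytes
        return used_pct > threshold
    return free_bytes < threshold


GB = 1024**3
print(is_breached("85%", total_bytes=500 * GB, free_bytes=50 * GB))
print(is_breached("100gb", total_bytes=500 * GB, free_bytes=150 * GB))
```

Because byte values count free space, a byte-based low watermark has to be larger than the high watermark, which in turn has to be larger than the flood stage value.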

Permanent fixes

1. Delete unused indices.

2. Force-merge segments to reclaim the space held by deleted documents and reduce the size of the shards on the affected node.

3. Attach an additional disk or increase the size of the disk used by the data node.
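The first two fixes map onto the delete index API and the force merge API. The Python sketch below only builds the requests without sending them; the cluster address and the index names are assumptions chosen for illustration. Note that a force merge itself needs some free disk space while it runs, so it may not be an option on a node that is already nearly full.

```python
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumed address of the cluster


def delete_index_request(index: str) -> Request:
    """Build (but do not send) a DELETE request for an unused index."""
    return Request(f"{ES_URL}/{index}", method="DELETE")


def force_merge_request(index: str) -> Request:
    """Build (but do not send) a force merge that targets deleted documents."""
    url = f"{ES_URL}/{index}/_forcemerge?only_expunge_deletes=true"
    return Request(url, method="POST")


# Hypothetical index names, used only to show the resulting requests.
for req in (delete_index_request("logs-2022.01"),
            force_merge_request("logs-2023.03")):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to urllib.request.urlopen would execute it against the cluster.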

Temporary hacks/fixes

1. Raise the watermark thresholds by dynamically updating the settings through the cluster update settings API. Adjust the values according to your situation.

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "100gb",
    "cluster.routing.allocation.disk.watermark.high": "50gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
    "cluster.info.update.interval": "1m"
  }
}
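When the values are generated by a script, a quick ordering check avoids submitting an inconsistent set. The following Python sketch builds the same request body from three free-space values in gigabytes; the function name watermark_settings is hypothetical, and the check assumes byte-based values, where low must be the largest.

```python
import json

PREFIX = "cluster.routing.allocation.disk.watermark"


def watermark_settings(low_gb: int, high_gb: int, flood_gb: int) -> dict:
    """Build a transient settings body from free-space thresholds in GB."""
    # Byte values count free space, so they must shrink from low to flood stage.
    if not (low_gb > high_gb > flood_gb > 0):
        raise ValueError("expected low > high > flood_stage > 0 for byte values")
    return {
        "transient": {
            f"{PREFIX}.low": f"{low_gb}gb",
            f"{PREFIX}.high": f"{high_gb}gb",
            f"{PREFIX}.flood_stage": f"{flood_gb}gb",
        }
    }


# Body for: PUT _cluster/settings
print(json.dumps(watermark_settings(100, 50, 10), indent=2))
```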

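2. Disable the disk threshold check through the same cluster update settings API. This removes the protection against a full disk, so set the value back to true once space has been freed.

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.threshold_enabled": false
  }
}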

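Even after applying these fixes, Elasticsearch versions before 7.4 will not return the indices to write mode on their own (from 7.4 onward, the block is released automatically once disk usage falls below the high watermark). To remove the block manually, call the index settings API: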

PUT _all/_settings
{
  "index.blocks.read_only_allow_delete": null
}
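The same call can be limited to specific indices instead of _all. The Python sketch below builds the path and body for that request; the function name unblock_request and the index names are hypothetical. Setting the value to null (None in Python) resets it to its default, which removes the block.

```python
import json


def unblock_request(indices):
    """Return the (path, body) of a PUT request that clears the write block."""
    if not indices:
        raise ValueError("at least one index name is required")
    path = "/" + ",".join(indices) + "/_settings"
    # json.dumps serialises None as null, which resets the setting.
    body = json.dumps({"index.blocks.read_only_allow_delete": None})
    return path, body


path, body = unblock_request(["logs-2023.03", "metrics-2023.03"])
print("PUT", path)
print(body)
```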
