What is Safe Drain?
Safe drain is a mechanism built into OMC that enables nodes to be safely removed from a cluster. When a node is removed from a cluster, the safe drain process ensures that all its data is transferred to other nodes in the cluster before the node is taken offline. This prevents any data loss or corruption that could occur if the node were simply shut down or disconnected without first transferring its data to other nodes.
During the safe drain process, the node being removed is marked as “draining,” which means that it no longer receives any new requests. Instead, it only processes outstanding requests until its workload has been completed. Once all requests have been processed, the node begins transferring its data to other nodes in the cluster. The safe drain process continues until all data has been transferred and the node is no longer part of the cluster. Only then does OMC shut the node down.
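The sequence described above can be sketched as a small simulation. This is an illustrative model only, not OMC's actual implementation: the `Node` class, the `safe_drain` function, and the "move each shard to the least-loaded node" policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory model of a cluster node (not an OMC type)."""
    name: str
    shards: list = field(default_factory=list)
    pending_requests: int = 0
    draining: bool = False
    online: bool = True


def safe_drain(node, others):
    """Drain `node` in the order described in the text."""
    if not others:
        raise ValueError("at least one other node is needed to receive the data")

    # 1. Mark the node as draining so it stops receiving new requests.
    node.draining = True

    # 2. Finish the outstanding requests.
    while node.pending_requests > 0:
        node.pending_requests -= 1

    # 3. Transfer the data; here each shard goes to the least-loaded node
    #    (an assumed placement policy, chosen only for illustration).
    while node.shards:
        target = min(others, key=lambda n: len(n.shards))
        target.shards.append(node.shards.pop())

    # 4. Only once the node holds no data is it taken offline.
    node.online = False
    return node
```

The point of the ordering is that step 4 can never run while the node still holds data or has work in flight, which is what distinguishes a safe drain from simply shutting the node down.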
Why is Safe Drain Important?
Safe drain is an important OMC feature because it helps ensure the stability and reliability of the cluster. If a node were abruptly disconnected from the cluster without transferring its data first, the risk of data loss or corruption would be high. This could result in serious issues such as index corruption, search queries returning incorrect or incomplete results, and even downtime for the entire cluster.
Using safe drain enables nodes to be safely removed from the cluster without any risk of data loss or corruption. This ensures that the cluster remains stable and reliable, even when nodes need to be taken offline for maintenance or other reasons.
How to Use Safe Drain
Using safe drain in OpenSearch is relatively simple: set DrainDataNodes to true when creating the cluster, and the rest happens automatically whenever the cluster is scaled down. It can also be turned on after cluster creation, in the Setting phase of a running cluster. The DrainDataNodes feature only works on nodes that are assigned the data role.
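As a rough illustration of the two rules above (the feature must be enabled, and it only applies to data nodes), the following sketch decides which nodes would be drained during a scale-down. Only the DrainDataNodes name comes from this article; the dictionary layout, the `roles` field, and the `nodes_to_drain` helper are hypothetical and do not reflect OMC's real configuration format.

```python
def nodes_to_drain(cluster_spec, removed_nodes):
    """Return the names of removed nodes that would be drained first.

    `cluster_spec` and `removed_nodes` use a made-up layout for this example.
    """
    # If the feature is off, nothing is drained before removal.
    if not cluster_spec.get("DrainDataNodes", False):
        return []
    # Draining only applies to nodes that carry the data role.
    return [n["name"] for n in removed_nodes if "data" in n.get("roles", [])]


# Hypothetical cluster definition with the feature enabled.
spec = {"name": "prod-cluster", "DrainDataNodes": True}
removed = [
    {"name": "data-3", "roles": ["data"]},
    {"name": "coord-1", "roles": ["ingest"]},
]
print(nodes_to_drain(spec, removed))
```

In this sketch, a node without the data role is removed without a drain step, since it holds no shard data to transfer.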
We highly recommend using the safe drain feature on production clusters. It should always be used when removing nodes from an OpenSearch cluster, because failing to use it could result in serious issues.