Elasticsearch / OpenSearch High CPU

By Opster Team

Updated: Jun 27, 2023

Overview

High CPU usage is often a symptom of other underlying issues, and as such there are a number of possible causes for it.

Causes of high CPU should be investigated and fixed, because a distressed node will at best slow down query response times, resulting in timeouts for clients, and at worst cause the node to disconnect and be lost from the cluster altogether.
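
Before changing anything, it is worth confirming which nodes are busy and what they are busy doing. As a starting point, the cat nodes API shows per-node CPU and load, and the hot threads API shows the threads currently consuming that CPU (both are read-only calls):

GET /_cat/nodes?v&h=name,node.role,cpu,load_1m,heap.percent

GET /_nodes/hot_threads

If one or two nodes stand out, the hot threads output usually points at the culprit, for example search, bulk indexing, segment merges or garbage collection.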

How to resolve it

To minimize the impact of distressed nodes on your search queries, make sure you have the following setting on your cluster (version 6.1 and above):

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}
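
If you want to confirm that the setting is active (on recent versions adaptive replica selection is enabled by default), you can read it back from the cluster settings; the filter_path parameter below only trims the response:

GET /_cluster/settings?include_defaults=true&filter_path=**.use_adaptive_replica_selection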

Check JVM garbage collection

High CPU is often a consequence of excessive JVM garbage collection, which in turn is caused by configuration or query-related issues.

In a healthy JVM, garbage collection should ideally meet the following conditions (the node stats call after this list shows how to check the actual figures):

  • Young GC is processed quickly (within 50 ms).
  • Young GC is not executed too frequently (about once every 10 seconds or less often).
  • Old GC is processed quickly (within 1 second).
  • Old GC is not executed too frequently (about once every 10 minutes or less often).
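
To compare these targets with what your nodes are actually doing, the JVM section of the node stats API reports cumulative garbage collection counts and times per collector; taking the difference between two calls a few minutes apart gives you the frequency and average duration (the filter_path parameter only trims the output):

GET /_nodes/stats/jvm?filter_path=nodes.*.name,nodes.*.jvm.gc.collectors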

There can be a variety of reasons why heap memory usage can increase:

  • Oversharding
  • Large aggregation sizes
  • Excessive bulk index size
  • Mapping issues
  • Heap being set incorrectly (see the check after this list)
  • JVM new ratio set incorrectly
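
A quick way to spot an incorrectly sized heap is to compare heap size with the machine's RAM on each node; a common rule of thumb is a heap of no more than roughly half the available RAM, and below about 32 GB so that compressed object pointers remain enabled:

GET /_cat/nodes?v&h=name,heap.current,heap.percent,heap.max,ram.current,ram.max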

To learn how to correct memory usage issues related to JVM garbage collection, see: Heap Size Usage and JVM Garbage Collection in OpenSearch – A Detailed Guide.

Check load on data nodes

If the CPU is high only on specific data nodes (some more than others), then you may have a load balancing or sharding issue.

This can occasionally be caused by applications that are not load balancing correctly across the data nodes, and are making all their HTTP calls to just one or some of the nodes. You should fix this in your application.
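
One way to see whether client traffic is skewed is to compare HTTP connection counts across the nodes; a node receiving all of the traffic will show far higher figures than its peers:

GET /_nodes/stats/http?filter_path=nodes.*.name,nodes.*.http

If one node's total_opened count dwarfs the others, point your clients at several node addresses (or at a load balancer or dedicated coordinating nodes) rather than at a single endpoint.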

However, it is more frequently caused by “hot” indices being located on just a small number of nodes. A typical example of this would be a logging application creating daily indices with just one shard per index. In this case, although you may have many indices with shards spread across all of the nodes, you may find that all of the indexing is being done by a single shard on one node, the one holding today’s logging index.
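
You can confirm this pattern by listing where the shards of the current index live and, if a single shard is the bottleneck, spread future daily indices across more nodes with an index template. The index pattern, template name and shard count below are illustrative only, so size them for your own cluster (older 6.x clusters would use the legacy _template API instead):

# logs-*, daily-logs and the shard count are hypothetical examples
GET /_cat/shards/logs-*?v&h=index,shard,prirep,node&s=index

PUT /_index_template/daily-logs
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.number_of_shards": 3
    }
  }
}

Note that a template only affects indices created after it is added; existing hot indices keep their current shard count.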

Check memory swapping to disk

High CPU may also be a symptom of memory swapping to disk, if swapping has not been properly disabled on the node.

OpenSearch performance can be heavily penalized if the node is allowed to swap memory to disk. OpenSearch can be configured to automatically prevent memory swapping on its host machine by adding the bootstrap.memory_lock: true setting to opensearch.yml (elasticsearch.yml on Elasticsearch). If bootstrap checks are enforced, the node will refuse to start when memory locking has been requested but could not be applied.
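
To verify that the memory lock actually took effect on every node, check the mlockall flag in the node info API (a read-only call):

GET /_nodes?filter_path=**.mlockall

If it reports false even though bootstrap.memory_lock is set to true, the OpenSearch process usually lacks permission to lock memory (for example the memlock ulimit, or LimitMEMLOCK=infinity when running under systemd).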

Check number of shards for oversharding

If you have a large number of shards on your cluster, then you may have an issue with oversharding.

Oversharding means having too many shards, which are consequently too small. While there is no minimum size limit for an OpenSearch shard, a large number of shards on an OpenSearch cluster requires extra resources, since the cluster needs to maintain metadata on the state of every shard in the cluster.

If your shards are too small, then you have 3 options:

  1. Eliminate empty indices
  2. Delete or close indices with old or unnecessary data
  3. Re-index into bigger indices (see the example below)
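
As a sketch of option 3 (the index names here are hypothetical), you can first list indices sorted by size to find the small ones, and then merge several of them into one larger index with the reindex API:

# index names below are examples only
GET /_cat/indices?v&h=index,pri,docs.count,store.size&s=store.size

POST /_reindex
{
  "source": {
    "index": "logs-2023.06.*"
  },
  "dest": {
    "index": "logs-2023.06"
  }
}

Once the documents have been reindexed and verified, the small source indices can be deleted.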

Check indexing efficiency

If your indexing is inefficient, it can drive CPU up on the data nodes doing the work. Common culprits include indexing documents one at a time instead of using the bulk API, very frequent refreshes, heavy ingest pipelines and overly complex mappings or analyzers.
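
As a hedged sketch (the index name and the interval are illustrative, and the right values depend on your workload), two common adjustments are lengthening the refresh interval on write-heavy indices and sending documents through the bulk API instead of one at a time:

# my-logs-index and the 30s interval are examples only
PUT /my-logs-index/_settings
{
  "index": {
    "refresh_interval": "30s"
  }
}

POST /_bulk
{ "index": { "_index": "my-logs-index" } }
{ "message": "first document" }
{ "index": { "_index": "my-logs-index" } }
{ "message": "second document" }

Bulk requests of a few megabytes each are a common starting point, but measure on your own cluster rather than assuming.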

Optimize slow and expensive search queries

Slow or expensive search queries are another frequent cause of high CPU. Typical offenders include leading-wildcard and regex queries, very deep pagination, large or deeply nested aggregations, and scripted fields.
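
To find out which queries are expensive, you can enable the search slow log on the indices you suspect and then review the slow log files on the data nodes. The index name and thresholds below are examples, not recommendations:

# my-logs-index and the thresholds are examples only
PUT /my-logs-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "2s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}

For a single suspect query, adding "profile": true to the search request body breaks down where the time is being spent.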
