Follow these tips to reindex OpenSearch more efficiently and improve reindexing performance:
- Disable Replicas
Disable replicas (set `index.number_of_replicas` to `0`) when building a new index from scratch that is not yet serving search traffic. The replica count is a dynamic setting, so it can be raised again once re-indexing has completed.
- Disable Refresh Interval
Disable the refresh interval as well by setting `index.refresh_interval` to `-1`. It can be restored once re-indexing has completed.
- Use Bulk API
Use the bulk API with multiple clients to get the maximum throughput from OpenSearch. Benchmark the OpenSearch cluster first to find the batch size and number of clients it can sustain without performance issues.
- Increase Buffer Size
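Increase the index buffer size (`indices.memory.index_buffer_size`, a node-level setting that defaults to 10% of the heap) and tune it against your indexing load.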
- Use Reindex API
If the `_source` field is enabled and you are re-indexing because of a breaking change, such as a different analyzer on existing fields, use OpenSearch's Reindex API.
- Disable Merge Throttling
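Disable merge throttling. The `indices.store.throttle.type` setting (set to `none`) dates from early Elasticsearch versions and is not recognized by OpenSearch, where merge throttling is instead controlled per index by `index.merge.scheduler.auto_throttle` (set it to `false`). If you have a massive write-heavy index, then you can make the change permanent.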
- Ensure Optimal Scalability Settings
Choosing the optimal number of primary shards is crucial for scalability, because the primary shard count of an existing index can’t be changed later on without re-indexing or using the split/shrink APIs. Refer to Opster’s guide to shards and replicas to understand more. Also, make sure you don’t end up creating “hotspots” in the cluster.
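
The replica, refresh interval, and Reindex API tips above combine into one workflow, which the following minimal sketch lays out as an ordered list of REST requests. It only builds the request bodies and does not contact a cluster. The index names, the `plan` helper, and the restore values (`1` replica, `1s` refresh) are assumptions for illustration, and the destination index is assumed to already exist with its new mappings and analyzers.

```python
import json

# Index-level settings applied only while the new index is being built.
# Both keys are dynamic index settings.
LOAD_PHASE_SETTINGS = {
    "index": {
        "number_of_replicas": 0,   # no replica copies during the load
        "refresh_interval": "-1",  # -1 turns periodic refresh off
    }
}


def restore_settings(replicas=1, refresh_interval="1s"):
    """Settings body to send once re-indexing has finished.

    The default values are assumptions; use whatever the index needs in production.
    """
    return {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }


def reindex_body(source_index, dest_index):
    """Body for POST /_reindex copying every document from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def plan(source_index, dest_index):
    """Return the ordered (method, path, body) requests for the whole workflow."""
    return [
        ("PUT", f"/{dest_index}/_settings", LOAD_PHASE_SETTINGS),
        ("POST", "/_reindex", reindex_body(source_index, dest_index)),
        ("PUT", f"/{dest_index}/_settings", restore_settings()),
    ]


if __name__ == "__main__":
    # Hypothetical index names, printed as requests an HTTP client could send.
    for method, path, body in plan("products_v1", "products_v2"):
        print(method, path, json.dumps(body))
```

Keeping the plan as plain data makes it easy to review the order of operations before pointing any HTTP client at a real cluster.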
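
For the bulk API tip, one possible way to batch documents and send them from several clients in parallel is sketched below. The `bulk_payload` function builds the newline-delimited body that `POST /_bulk` expects. The `send` callable, the batch size, and the client count are hypothetical placeholders, and the right values depend on your own benchmark.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_payload(index, docs):
    """Build the NDJSON body for POST /_bulk.

    `docs` is a list of (id, source) pairs; each pair becomes an action line
    followed by the document source line.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def send_all(index, docs, send, batch_size=500, clients=4):
    """Send batches in parallel and return one result per batch, in batch order.

    `send` is a hypothetical callable that POSTs a single payload to /_bulk.
    """
    payloads = [bulk_payload(index, batch) for batch in chunked(docs, batch_size)]
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(send, payloads))


if __name__ == "__main__":
    docs = [(str(i), {"title": f"doc {i}"}) for i in range(5)]
    # Stand-in sender that only reports the payload length; swap in a real HTTP call.
    print(send_all("products_v2", docs, send=len, batch_size=2, clients=2))
```

Threads are used here because sending requests is I/O-bound; the sender is injected so the batching logic can be exercised without a cluster.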
To easily improve your indexing and search performance, we recommend you try AutoOps for OpenSearch. AutoOps detects issues and improves OpenSearch performance by analyzing shard sizes, threadpools, memory, snapshots, disk watermarks, and more.