Exception during cleanup of stale shard blobs – How to solve this OpenSearch error

Opster Team

Aug-23, Version: 2.2-2.9

Briefly, this error occurs when OpenSearch fails while cleaning up "stale" shard blobs, which are files in a snapshot repository that no snapshot references any longer (typically left behind after a snapshot is deleted). Possible causes include insufficient permissions on the repository, disk space issues, or network connectivity problems between the nodes and the repository storage. To resolve it, free up or add disk space, check and adjust the permissions of the OpenSearch process on the repository, and make sure network connectivity to the repository is stable. If the automatic cleanup continues to fail, you may need to delete the stale shard blobs manually.
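As a starting point for checking the repository, the sketch below calls two snapshot REST endpoints, `POST /_snapshot/<repository>/_verify` and `POST /_snapshot/<repository>/_cleanup`, using the JDK's built-in HTTP client. The class name, the base URL `http://localhost:9200`, and the repository name `my_repository` are placeholders chosen for illustration, and the sketch assumes a cluster reachable without authentication or TLS; adapt it if the security plugin is enabled.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StaleBlobCleanup {

    // Builds the URI of a snapshot repository endpoint such as "_verify" or "_cleanup".
    // Assumes the repository name contains only URL-safe characters.
    static URI repositoryUri(String baseUrl, String repository, String action) {
        return URI.create(baseUrl + "/_snapshot/" + repository + "/" + action);
    }

    // Builds a POST request without a body for the given endpoint.
    static HttpRequest postRequest(URI uri) {
        return HttpRequest.newBuilder(uri)
            .header("Accept", "application/json")
            .POST(HttpRequest.BodyPublishers.noBody())
            .build();
    }

    // Verifies the repository first, then asks OpenSearch to remove unreferenced data.
    // Prints the HTTP status and the JSON response body of each call.
    static void verifyAndCleanup(String baseUrl, String repository)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        for (String action : new String[] {"_verify", "_cleanup"}) {
            HttpRequest request = postRequest(repositoryUri(baseUrl, repository, action));
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(action + " -> HTTP " + response.statusCode());
            System.out.println(response.body());
        }
    }
}
```

Calling `StaleBlobCleanup.verifyAndCleanup("http://localhost:9200", "my_repository")` from a `main` method runs both requests in order. A failing `_verify` call usually points to a permission or connectivity problem, which is worth resolving before retrying the cleanup.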

This guide will help you check for common problems that cause the log “[{}] Exception during cleanup of stale shard blobs” to appear. To understand the issues related to this log, it helps to be familiar with the following OpenSearch concepts: blobstore, repositories, shard.

Log Context

The log “[{}] Exception during cleanup of stale shard blobs” is emitted from BlobStoreRepository.java.
We extracted the following from the OpenSearch source code for those seeking in-depth context:

        } catch (Exception e) {
            // TODO: We shouldn't be blanket catching and suppressing all exceptions here and instead handle them safely upstream.
            // Currently this catch exists as a stop gap solution to tackle unexpected runtime exceptions from implementations
            // bubbling up and breaking the snapshot functionality.
            assert false : e;
            logger.warn(new ParameterizedMessage("[{}] Exception during cleanup of stale shard blobs", snapshotIds), e);
            listener.onFailure(e);
        }
    }

    // When remoteStoreLockManagerFactory is non-null, while deleting the files, lock files are also released before deletion of respective
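
The catch block above logs the warning and then passes the exception to the listener, so the caller sees the cleanup as failed instead of the exception propagating further. The following is a simplified, self-contained sketch of that pattern, not the actual OpenSearch implementation: the `Listener` interface is a hypothetical stand-in for the listener type used in the source, the `cleanup` method, its success path, and its `Runnable` argument are invented for illustration, and `System.err` stands in for the logger.

```java
public class CleanupListenerSketch {

    // Hypothetical stand-in for the listener type used in the OpenSearch source.
    interface Listener<T> {
        void onResponse(T result);
        void onFailure(Exception e);
    }

    // Runs the (hypothetical) stale blob deletion and reports the outcome to the listener.
    static void cleanup(Runnable deleteStaleBlobs, Listener<Void> listener) {
        try {
            deleteStaleBlobs.run();
            listener.onResponse(null);
        } catch (Exception e) {
            // Mirrors the excerpt: log a warning, then hand the exception to the listener.
            System.err.println("Exception during cleanup of stale shard blobs: " + e);
            listener.onFailure(e);
        }
    }
}
```

In this sketch, any runtime exception thrown by the deletion step ends up in `onFailure`, which is why the message in the log is accompanied by the underlying exception and its stack trace. That underlying exception is usually the most useful clue to the root cause.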

 
