Briefly, this warning is logged when a DeleteByQuery request times out. Delete-by-query removes every document in an index that matches a particular query; in this case it is issued by the action that deletes a data frame analytics job, in order to remove the job's stored state. Timeouts are usually caused by long-running queries or by resource constraints such as insufficient memory or CPU, or a slow network. To resolve the issue, optimize the query or increase the resources available to Elasticsearch. Delete-by-query already works through the matching documents in scroll batches, so you can also lower the batch size with the scroll_size parameter or split the work with the slices parameter, which makes each batch less likely to time out.
This guide will help you check for common problems that cause the log "DeleteByQuery for state timed out" to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: delete, delete-by-query and plugin.
Overview
DELETE is an Elasticsearch API that removes a document from a specific index. The request requires the index name and the _id of the document to delete.
Delete a document
DELETE /my_index/_doc/1
Notes
- A delete request returns a 404 status code if the document does not exist in the index.
- If you want to delete a set of documents that match a query, use the delete-by-query API instead.
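The 404 behavior described in the notes can be handled explicitly in client code. The following is a minimal sketch using only the Python standard library; the helper names `document_url` and `delete_document`, and the assumption of an unsecured cluster reachable over plain HTTP, are illustrative and not part of Elasticsearch.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


def document_url(base_url: str, index: str, doc_id: str) -> str:
    """Build the URL of a single document, escaping the index name and _id."""
    base = base_url.rstrip("/")
    return f"{base}/{quote(index, safe='')}/_doc/{quote(doc_id, safe='')}"


def delete_document(base_url: str, index: str, doc_id: str) -> bool:
    """Return True if the document was deleted, False if it did not exist."""
    request = urllib.request.Request(
        document_url(base_url, index, doc_id), method="DELETE"
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False  # the document was not in the index
        raise  # any other HTTP error is a genuine problem
```

Calling `delete_document("http://localhost:9200", "my_index", "1")` against a local test cluster (an assumed address) would issue the same request as the DELETE example above.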
Overview
Delete-by-query is an Elasticsearch API, introduced in version 5.0, that deletes all documents matching the provided query. In 2.x versions, users had to install the Delete-By-Query plugin and use the DELETE /_query endpoint for the same use case.
What it is used for
This API deletes all the documents in one or more indices that match a query. By default, the request blocks until every matching document has been processed. If you set wait_for_completion=false, Elasticsearch instead returns a task ID immediately and runs the deletion in the background, so you don't have to wait for the process to complete.
Examples
Delete all the documents of an index without deleting the mapping and settings:
POST /my_index/_delete_by_query?conflicts=proceed&pretty
{
  "query": {
    "match_all": {}
  }
}
The conflicts parameter controls what happens when a version conflict occurs. With conflicts=proceed, the request continues past documents that hit a version conflict. The default value, abort, stops the request at the first conflict.
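When timeouts are the concern, the same request can be combined with other delete-by-query URL parameters. The sketch below only builds the request line and body; the function name `build_delete_by_query` and the chosen defaults (for example a `scroll_size` of 500) are assumptions for illustration, not recommended values.

```python
import json
from urllib.parse import urlencode


def build_delete_by_query(index, query, *, conflicts="proceed", slices="auto",
                          scroll_size=500, wait_for_completion=False):
    """Return (method, path, body) for a _delete_by_query request."""
    params = {
        "conflicts": conflicts,      # continue past version conflicts
        "slices": slices,            # split the work into parallel sub-tasks
        "scroll_size": scroll_size,  # documents fetched per scroll batch
        # false makes Elasticsearch return a task ID instead of blocking
        "wait_for_completion": str(wait_for_completion).lower(),
    }
    path = f"/{index}/_delete_by_query?{urlencode(params)}"
    body = json.dumps({"query": query})
    return "POST", path, body
```

Smaller batches and more slices trade total throughput for shorter individual requests, so the right values depend on the cluster and should be tested.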
Notes
- A long-running delete_by_query can be cancelled using the _tasks API.
- Inside the query body, you can use the same syntax for queries that are available under the _search API.
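Cancelling through the _tasks API takes two requests: one to find the running delete-by-query task and one to cancel it. The following sketch builds both request paths; the helper names are hypothetical, and the task ID used in the usage note is a placeholder in the usual `node_id:task_number` form.

```python
from urllib.parse import quote, urlencode


def list_delete_by_query_tasks_path() -> str:
    """Path for GET: list running delete-by-query tasks with details."""
    params = urlencode(
        {"detailed": "true", "actions": "*/delete/byquery"}, safe="*/"
    )
    return f"/_tasks?{params}"


def cancel_task_path(task_id: str) -> str:
    """Path for POST: cancel one task, given an ID like 'node_id:12345'."""
    node_id, separator, number = task_id.partition(":")
    if not (node_id and separator and number.isdigit()):
        raise ValueError(f"expected 'node_id:task_number', got {task_id!r}")
    return f"/_tasks/{quote(task_id, safe=':')}/_cancel"
```

For example, `cancel_task_path("r1A2WoRbTwKZ516z6NEs5A:36619")` yields the path to send with POST. Cancellation stops further batches, but documents already deleted are not restored.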
Common problems
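Elasticsearch takes a snapshot of the index when it begins processing a delete-by-query request and uses the internal _version of each document when deleting it. If a document is updated between the snapshot and the delete, the result is a version conflict. By default the request then aborts, and documents that were deleted before the conflict stay deleted. Set conflicts=proceed to count conflicts and continue instead.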
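Timeouts, version conflicts and bulk failures are all reported in the delete-by-query response body, so they can be checked after each run. The sketch below inspects the `timed_out`, `version_conflicts` and `failures` fields of a parsed response; the function name and the message wording are illustrative only.

```python
def summarize_delete_by_query(response: dict) -> list:
    """Return human-readable warnings for a parsed _delete_by_query response."""
    warnings = []
    if response.get("timed_out"):
        warnings.append("request timed out")
    conflicts = response.get("version_conflicts", 0)
    if conflicts:
        warnings.append(f"{conflicts} version conflicts")
    failures = response.get("failures", [])
    if failures:
        warnings.append(f"{len(failures)} bulk failures")
    return warnings
```

An empty list means none of the three conditions was reported. A non-empty list is a signal to retry the request or to investigate the cluster before running it again.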
Overview
A plugin is used to enhance the core functionalities of Elasticsearch. Elasticsearch provides some core plugins as a part of their release installation. In addition to those core plugins, it is possible to write your own custom plugins as well. There are several community plugins available on GitHub for various use cases.
Examples
Show the help text for the plugin tool:
sudo bin/elasticsearch-plugin -h
Installing the S3 plugin for storing Elasticsearch snapshots on S3:
sudo bin/elasticsearch-plugin install repository-s3
Removing a plugin:
sudo bin/elasticsearch-plugin remove repository-s3
Installing a plugin using the file’s path:
sudo bin/elasticsearch-plugin install file:///path/to/plugin.zip
Notes and good things to know
- Plugins are installed and removed using the elasticsearch-plugin script, which ships as a part of the Elasticsearch installation and can be found inside the bin/ directory of the Elasticsearch installation path.
- A plugin has to be installed on every node of the cluster and each of the nodes has to be restarted to make the plugin visible.
- You can also download the plugin manually and then install it using the elasticsearch-plugin install command, providing the file name/path of the plugin’s source file.
- When a plugin is removed, you will need to restart every Elasticsearch node in order to complete the removal process.
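Because a plugin must be present on every node, it can be useful to verify the installation across the cluster. The sketch below assumes you have already fetched the rows returned by `GET /_cat/plugins?format=json` (each row has `name` and `component` fields) and have a list of expected node names; the function name is hypothetical.

```python
def nodes_missing_plugin(node_names, cat_plugins_rows, plugin):
    """Return the sorted node names on which the given plugin is not installed."""
    installed_on = {
        row["name"] for row in cat_plugins_rows if row["component"] == plugin
    }
    return sorted(set(node_names) - installed_on)
```

Any node name returned still needs the plugin installed, followed by a restart of that node.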
Common issues
- Managing permission issues during and after plugin installation is the most common problem. If Elasticsearch was installed using the DEB or RPM packages then the plugin has to be installed using the root user. Otherwise you can install the plugin as the user that owns all of the Elasticsearch files.
- In the case of DEB or RPM package installation, it is important to check the permissions of the plugins directory after you install it. You can update the permission if it has been modified using the following command:
chown -R elasticsearch:elasticsearch path_to_plugin_directory
- If your Elasticsearch nodes are running in a private subnet without internet access, you cannot install a plugin directly. In this case, you can simply download the plugins and copy the files inside the plugins directory of the Elasticsearch installation path on every node. The node has to be restarted in this case as well.
Log Context
The log "[{}] DeleteByQuery for state timed out" is emitted by the class TransportDeleteDataFrameAnalyticsAction (TransportDeleteDataFrameAnalyticsAction.java).
We extracted the following from the Elasticsearch source code for those seeking in-depth context:
// Step 3. Delete the config
ActionListener<BulkByScrollResponse> deleteStateHandler = ActionListener.wrap(
    bulkByScrollResponse -> {
        if (bulkByScrollResponse.isTimedOut()) {
            logger.warn("[{}] DeleteByQuery for state timed out", id);
        }
        if (bulkByScrollResponse.getBulkFailures().isEmpty() == false) {
            logger.warn("[{}] {} failures and {} conflicts encountered while runnint DeleteByQuery for state", id,
                bulkByScrollResponse.getBulkFailures().size(), bulkByScrollResponse.getVersionConflicts());
            for (BulkItemResponse.Failure failure : bulkByScrollResponse.getBulkFailures()) {