Briefly, this error occurs when Elasticsearch has retried a search operation several times without success. Possible causes include network issues, heavy load on the cluster, or a misconfiguration. To resolve this issue, you can try the following:
1) Check the network connectivity between the nodes.
2) Monitor the cluster’s health and performance to identify any bottlenecks.
3) Review the Elasticsearch configuration and ensure it’s correctly set up.
4) Increase the retry limit if the issue is due to temporary heavy load.
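One way to check for heavy load is to look at search thread pool rejections, since the source excerpt at the end of this guide shows that the retries are triggered by rejected searches. The sketch below is in Python and assumes you have already fetched the JSON returned by GET _nodes/stats/thread_pool. The function name and the sample data are hypothetical and only illustrate the shape of that response.

```python
def search_rejections(nodes_stats):
    """Return {node_name: rejected_count} for nodes that rejected searches.

    `nodes_stats` is assumed to be the parsed JSON body of
    GET _nodes/stats/thread_pool.
    """
    rejected = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        count = node.get("thread_pool", {}).get("search", {}).get("rejected", 0)
        if count > 0:
            # Fall back to the node ID if the node name is missing.
            rejected[node.get("name", node_id)] = count
    return rejected


# Made-up, heavily trimmed sample response for illustration only.
sample_stats = {
    "nodes": {
        "abc123": {"name": "node-1", "thread_pool": {"search": {"queue": 0, "rejected": 0}}},
        "def456": {"name": "node-2", "thread_pool": {"search": {"queue": 950, "rejected": 17}}},
    }
}

print(search_rejections(sample_stats))
```

A rejected count that keeps growing on one node suggests that node cannot keep up with its search load.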
This guide will help you check for common problems that cause the log “giving up on search because we retried [{}] times without success” to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: index, search, reindex.
Overview
In Elasticsearch, an index (plural: indices) contains a schema and can have one or more shards and replicas. An Elasticsearch index is divided into shards and each shard is an instance of a Lucene index.
Indices are used to store the documents in dedicated data structures corresponding to the data type of fields. For example, text fields are stored inside an inverted index whereas numeric and geo fields are stored inside BKD trees.
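To make the idea of an inverted index concrete, here is a toy sketch in Python. It is a simplification for illustration, not how Lucene implements it: the whitespace tokenizer, the function name and the sample documents are all assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


# Hypothetical sample documents.
docs = {
    1: "Elasticsearch stores text",
    2: "Lucene stores segments",
}

inverted = build_inverted_index(docs)
print(sorted(inverted["stores"]))
```

Looking up a term is then a single dictionary access instead of a scan over every document.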
Examples
Create index
The following example applies to Elasticsearch version 7.x onwards, where mappings are defined without a type name. It creates an index named test_index1 with two shards, each having one replica.
PUT /test_index1?pretty
{
    "settings" : {
        "number_of_shards" : 2,
        "number_of_replicas" : 1
    },
    "mappings" : {
        "properties" : {
            "tags" : { "type" : "keyword" },
            "updated_at" : { "type" : "date" }
        }
    }
}
List indices
All the index names and their basic information can be retrieved using the following command:
GET _cat/indices?v
Index a document
Let’s add a document in the index with the command below:
PUT test_index1/_doc/1
{
  "tags": [
    "opster",
    "elasticsearch"
  ],
  "updated_at": "2020-01-01"
}
Query an index
GET test_index1/_search
{
  "query": {
    "match_all": {}
  }
}
Query multiple indices
It is possible to search multiple indices with a single request. In a raw HTTP request, index names are sent as a comma-separated list, as shown in the example below. When querying through a programming language client such as Python or Java, index names are passed as a list.
GET test_index1,test_index2/_search
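If you build the raw HTTP request yourself, the list of index names has to be joined into that comma-separated form. A minimal Python sketch follows. The helper name search_path is hypothetical, and it assumes the index names need nothing beyond standard URL quoting.

```python
from urllib.parse import quote


def search_path(indices):
    """Build the _search URL path for one or more index names."""
    # Keep '*' unescaped so wildcard patterns still work.
    joined = ",".join(quote(name, safe="*") for name in indices)
    return f"/{joined}/_search"


print(search_path(["test_index1", "test_index2"]))
```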
Delete indices
DELETE test_index1
Common problems
- It is good practice to define the settings and mapping of an index wherever possible. If this is not done, Elasticsearch tries to guess the data type of each field at indexing time. This automatic process has disadvantages, such as mapping conflicts, duplicate data and incorrect data types being set in the index. If the fields are not known in advance, it’s better to use dynamic templates.
- Elasticsearch supports wildcard patterns in index names, which helps when querying multiple indices but can also be very destructive. For example, it is possible to delete all the indices with a single command:
DELETE /*
To disable this, add the following line to elasticsearch.yml:
action.destructive_requires_name: true
Overview
Search refers to the searching of documents in one or more indices. The simplest search is a GET request to the _search endpoint. The search query can be provided either in the query string or in the request body.
Examples
If no search parameters are provided, every document in the index is a hit, and by default 10 hits are returned.
GET my_documents/_search
A JSON object is returned in response to a search query. A 200 response code means the request was completed successfully.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      ...
    ]
  }
}
Notes and good things to know
- Distributed search is challenging because every shard of the index needs to be searched for hits, and those hits then have to be combined into a single sorted list as the final result.
- There are two phases of search: the query phase and the fetch phase.
- In the query phase, the query is executed on each shard locally and top hits are returned to the coordinating node. The coordinating node merges the results and creates a global sorted list.
- In the fetch phase, the coordinating node fetches the actual documents for those hit IDs from the shards that hold them and returns them to the requesting client.
- A coordinating node needs enough memory and CPU in order to handle the fetch phase.
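The two phases described above can be sketched as a small Python simulation. This is a simplified illustration and not Elasticsearch's implementation: shards are plain dictionaries, scores are precomputed, and all names and sample data are hypothetical.

```python
import heapq


def query_phase(shard, size):
    """Return the shard's local top hits as (score, doc_id) pairs only."""
    scored = ((score, doc_id) for doc_id, (score, _source) in shard.items())
    return heapq.nlargest(size, scored)


def search(shards, size=10):
    # Query phase: collect lightweight (score, id) pairs from every shard.
    candidates = []
    for shard_no, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, size):
            candidates.append((score, doc_id, shard_no))

    # The coordinating node merges them into one global sorted list.
    top = heapq.nlargest(size, candidates)

    # Fetch phase: load full documents only for the global top hits.
    return [
        {"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id][1]}
        for score, doc_id, shard_no in top
    ]


# Each shard maps doc_id -> (precomputed score, document source).
shards = [
    {"a": (0.9, {"title": "A"}), "b": (0.2, {"title": "B"})},
    {"c": (0.7, {"title": "C"}), "d": (0.4, {"title": "D"})},
]

results = search(shards, size=2)
print([hit["_id"] for hit in results])
```

Only IDs and scores travel in the query phase, which is why the full documents are loaded just for the final top hits.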
Overview
Reindexing means copying existing data from a source index to a destination index, which can be in the same cluster or a different one. Elasticsearch has a dedicated _reindex endpoint for this purpose. Reindexing is mostly required when updating mappings or settings.
Examples
Reindex data from a source index to destination index in the same cluster:
POST /_reindex?pretty
{
  "source": {
    "index": "news"
  },
  "dest": {
    "index": "news_v2"
  }
}
Notes
- The Reindex API does not copy settings and mappings from the source index to the destination index. You need to create the destination index with the desired settings and mappings before you begin the reindexing process.
- The API exposes an extensive list of configuration options to fetch data from the source index, such as query-based indexing and selecting multiple indices as the source index.
- In some scenarios the Reindex API is not sufficient, for example when reindexing requires complex data processing and data modification based on application logic. In this case, you can write a custom script that uses the Elasticsearch scroll API to fetch the data from the source index and the bulk API to index it into the destination index.
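For the custom-script case in the last note, the core step is turning a page of scroll hits into a bulk request body. The Python sketch below covers only that step, under stated assumptions: the helper name to_bulk_body, the transform callback and the sample hit are hypothetical, and sending the body to the _bulk endpoint is left out.

```python
import json


def to_bulk_body(hits, dest_index, transform=lambda source: source):
    """Build a newline-delimited bulk body from search hits.

    `transform` is where application-specific changes to each
    document's source would go.
    """
    lines = []
    for hit in hits:
        # Action line: index the document under its original ID.
        lines.append(json.dumps({"index": {"_index": dest_index, "_id": hit["_id"]}}))
        # Source line: the (possibly modified) document.
        lines.append(json.dumps(transform(hit["_source"])))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical page of hits, shaped like the hits.hits array of a response.
page = [{"_id": "1", "_source": {"title": "hello"}}]

body = to_bulk_body(page, "news_v2", lambda src: {**src, "title": src["title"].upper()})
print(body, end="")
```

In a full script this would run once per scroll page until a page comes back empty.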
Log Context
The log “giving up on search because we retried [{}] times without success” is emitted from RetryListener.java.
We extracted the following from the Elasticsearch source code for those seeking in-depth context:
            retryCount += 1;
            TimeValue delay = retries.next();
            logger.trace(() -> new ParameterizedMessage("retrying rejected search after [{}]", delay), e);
            schedule(() -> retryScrollHandler.accept(this), delay);
        } else {
            logger.warn(() -> new ParameterizedMessage("giving up on search because we retried [{}] times without success", retryCount), e);
            delegate.onFailure(e);
        }
    }
    private void schedule(Runnable runnable, TimeValue delay) {
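The retry pattern in the excerpt above can be sketched in Python for readers who want to reason about it outside the Elasticsearch code base. This is an illustrative analogue and not a port: RejectedError, search_with_retries and the delay values are hypothetical, and it blocks with sleep where the original schedules the retry.

```python
import time


class RejectedError(Exception):
    """Hypothetical stand-in for a search rejected by an overloaded node."""


def search_with_retries(do_search, delays, sleep=time.sleep):
    """Call do_search, retrying after each delay while it is rejected."""
    remaining = iter(delays)
    retry_count = 0
    while True:
        try:
            return do_search()
        except RejectedError as err:
            delay = next(remaining, None)
            if delay is None:
                # No delays left: mirror the log message and give up.
                raise RuntimeError(
                    f"giving up on search because we retried [{retry_count}] "
                    "times without success"
                ) from err
            retry_count += 1
            sleep(delay)


# Usage: a search that is rejected twice and then succeeds.
attempts = []


def flaky_search():
    attempts.append(1)
    if len(attempts) < 3:
        raise RejectedError("search queue is full")
    return {"hits": []}


result = search_with_retries(flaky_search, delays=[0.01, 0.02, 0.04])
print(len(attempts), result)
```

The search is retried only while delays remain, so the length of the delay sequence acts as the retry limit.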