Briefly, this error occurs when a search request waits for a given sequence number to become visible through a refresh and that does not happen within the specified timeout period. This could be due to heavy indexing load, slow disk I/O, or network latency. To resolve this issue, you can increase the timeout value, reduce the indexing load, optimize your disk I/O operations, or improve your network connectivity. Additionally, ensure your Elasticsearch cluster is properly sized and configured for your workload.
This guide will help you check for common problems that cause the log “Wait for seq_no [{}] refreshed timed out [{}]” to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concept: search.
Overview
Search refers to the searching of documents in one or more indices. The simplest search is a GET request to the _search endpoint. The search query can be provided either in the query string or in a request body.
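The two ways of passing a query can be sketched with the JDK's built-in HTTP client. This is a minimal illustration, not an official client: the cluster address http://localhost:9200, the field name title, and the class and method names are assumptions made for the example. The requests are only built, not sent, and the body variant uses POST, which the _search endpoint accepts in addition to GET.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class SearchRequests {
    // Assumption: a local, unsecured cluster. Replace with your own address.
    static final String BASE = "http://localhost:9200";

    // Query supplied in the URL through the q parameter (query string search).
    static HttpRequest queryStringSearch(String index, String luceneQuery) {
        String q = URLEncoder.encode(luceneQuery, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_search?q=" + q))
                .GET()
                .build();
    }

    // Query supplied as a JSON request body.
    static HttpRequest requestBodySearch(String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // "title" is a hypothetical field name used only for illustration.
        HttpRequest byUrl = queryStringSearch("my_documents", "title:elasticsearch");
        HttpRequest byBody = requestBodySearch("my_documents", "{\"query\":{\"match_all\":{}}}");
        System.out.println(byUrl.method() + " " + byUrl.uri());
        System.out.println(byBody.method() + " " + byBody.uri());
    }
}
```

To actually run a search, the built request would be passed to HttpClient.send together with a body handler such as HttpResponse.BodyHandlers.ofString().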
Examples
The request below searches the my_documents index without any search parameters, so every document is a hit and, by default, 10 hits are returned.
GET my_documents/_search
A JSON object is returned in response to a search query. A 200 response code means the request was completed successfully.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      ...
    ]
  }
}
Notes and good things to know
- Distributed search is challenging because every shard of the index must be searched for hits, and those hits must then be combined into a single sorted list as the final result.
- There are two phases of search: the query phase and the fetch phase.
- In the query phase, the query is executed on each shard locally and top hits are returned to the coordinating node. The coordinating node merges the results and creates a global sorted list.
- In the fetch phase, the coordinating node retrieves the actual documents for those hit IDs from the relevant shards and returns them to the requesting client.
- A coordinating node needs enough memory and CPU in order to handle the fetch phase.
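The two phases described in the notes above can be sketched in a few lines of Java. This is a simplified, in-memory illustration and not Elasticsearch's implementation: the ShardHit type, the method names, and the map standing in for each shard's stored documents are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class QueryThenFetchSketch {
    // Hypothetical type: what a shard reports in the query phase (ID and score only).
    static final class ShardHit {
        final int shard;
        final String docId;
        final double score;

        ShardHit(int shard, String docId, double score) {
            this.shard = shard;
            this.docId = docId;
            this.score = score;
        }
    }

    // Query phase, coordinating-node side: merge per-shard top hits into one
    // list sorted by descending score and keep the global top `size`.
    static List<ShardHit> queryPhaseMerge(List<List<ShardHit>> perShardTopHits, int size) {
        List<ShardHit> merged = new ArrayList<>();
        for (List<ShardHit> shardHits : perShardTopHits) {
            merged.addAll(shardHits);
        }
        merged.sort(Comparator.comparingDouble((ShardHit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(size, merged.size())));
    }

    // Fetch phase: look up the full document for each winning hit on the
    // shard that owns it. Each map stands in for one shard's stored documents.
    static List<String> fetchPhase(List<ShardHit> winners, List<Map<String, String>> shardStores) {
        List<String> docs = new ArrayList<>();
        for (ShardHit hit : winners) {
            docs.add(shardStores.get(hit.shard).get(hit.docId));
        }
        return docs;
    }

    // Two shards with two local hits each; the global top 2 are requested.
    static List<String> demo() {
        List<List<ShardHit>> perShard = List.of(
                List.of(new ShardHit(0, "a", 0.9), new ShardHit(0, "b", 0.4)),
                List.of(new ShardHit(1, "c", 0.7), new ShardHit(1, "d", 0.1)));
        List<Map<String, String>> stores = List.of(
                Map.of("a", "source of a", "b", "source of b"),
                Map.of("c", "source of c", "d", "source of d"));
        return fetchPhase(queryPhaseMerge(perShard, 2), stores);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [source of a, source of c]
    }
}
```

Only IDs and scores travel in the first step, which is why the full documents are fetched just for the hits that survive the merge.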
Log Context
The log “Wait for seq_no [{}] refreshed timed out [{}]” is emitted from the class SearchService.java. We extracted the following from the Elasticsearch source code for those seeking in-depth context:
 // index shard on timeout so that a timed-out listener does not use up any listener slots.
 final TimeValue timeout = request.getWaitForCheckpointsTimeout();
 final Scheduler.ScheduledCancellable timeoutTask = NO_TIMEOUT.equals(timeout) ? null : threadPool.schedule(() -> {
     if (isDone.compareAndSet(false, true)) {
         listener.onFailure(
             new ElasticsearchTimeoutException("Wait for seq_no [{}] refreshed timed out [{}]", waitForCheckpoint, timeout)
         );
     }
 }, timeout, Names.SAME);
 // allow waiting for not-yet-issued sequence number if shard isn't promotable to primary and the timeout is less than or equal
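The excerpt relies on a common pattern: a scheduled timeout task and the normal completion path race to flip a shared flag, and only the winner notifies the listener. The following is a minimal standalone sketch of that pattern using only JDK classes. It is not the Elasticsearch code; the class name, method names, and outcome strings are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class WaitWithTimeoutSketch {
    // Schedules a timeout and returns a callback to invoke when the awaited
    // event (for example, a refresh) occurs. Whichever side flips isDone
    // first notifies the listener; the other side becomes a no-op.
    static Runnable waitWithTimeout(ScheduledExecutorService scheduler,
                                    long timeoutMillis,
                                    Consumer<String> listener) {
        AtomicBoolean isDone = new AtomicBoolean(false);
        ScheduledFuture<?> timeoutTask = scheduler.schedule(() -> {
            if (isDone.compareAndSet(false, true)) {
                listener.accept("timed out");
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
        return () -> {
            if (isDone.compareAndSet(false, true)) {
                timeoutTask.cancel(false); // the timeout is no longer needed
                listener.accept("refreshed");
            }
        };
    }

    // Runs one wait and returns which side won the race.
    static String demo(long timeoutMillis, boolean completeEarly) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            CompletableFuture<String> outcome = new CompletableFuture<>();
            Runnable onRefreshed = waitWithTimeout(scheduler, timeoutMillis, outcome::complete);
            if (completeEarly) {
                onRefreshed.run();
            }
            return outcome.join(); // blocks until either side has reported
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(50, false));    // nothing completes the wait
        System.out.println(demo(10_000, true)); // completed before the timeout
    }
}
```

Because compareAndSet succeeds for exactly one caller, the listener is notified once, whether the timeout or the completion arrives first.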