How To Solve Issues Related to Log – failed to perform on node




Troubleshooting background

To troubleshoot the Elasticsearch log “failed to perform on node”, it’s important to understand the Elasticsearch concepts involved: nodes and replication. The sections below explain each in detail, with common problems, examples and useful tips.

Nodes in Elasticsearch

What it is

Simply put, a node is a single server that is part of a cluster. Each node is assigned one or more roles, which describe the node’s responsibilities and operations. Data nodes store the data and participate in the cluster’s indexing and search capabilities, while master nodes are responsible for managing cluster activities and storing the cluster state, including the metadata.

While it’s possible to run several node instances of Elasticsearch on the same hardware, it’s considered a best practice to limit a server to a single running instance of Elasticsearch.

Nodes connect to each other and form a cluster by using a discovery method. 
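
To see how roles map to nodes in practice, the sketch below classifies nodes from text shaped like the output of `GET _cat/nodes?h=name,node.role`. The sample rows and node names are invented for illustration, and only the role letters `m` (master-eligible), `d` (data), `i` (ingest) and `-` (coordinating only) are handled; newer Elasticsearch versions report additional letters that this sketch simply ignores.

```python
# Role letters as reported in the node.role column of _cat/nodes.
# Only a subset is covered; any other letter is ignored by this sketch.
ROLE_LETTERS = {"m": "master-eligible", "d": "data", "i": "ingest"}


def parse_roles(cat_nodes_output):
    """Map node name -> list of role names from '<name> <roles>' rows."""
    nodes = {}
    for line in cat_nodes_output.strip().splitlines():
        name, letters = line.split()
        if letters == "-":
            # A lone dash means the node has no configured role.
            nodes[name] = ["coordinating-only"]
        else:
            nodes[name] = [ROLE_LETTERS[c] for c in letters if c in ROLE_LETTERS]
    return nodes


# Hypothetical output; node names are invented for illustration.
SAMPLE = """
master-1 m
data-1   di
client-1 -
"""

roles = parse_roles(SAMPLE)
```

Against a real cluster, the same function could be fed the response body of the `_cat/nodes` request instead of `SAMPLE`.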

Roles
Master node

Master nodes are in charge of cluster-wide settings and changes: deleting or creating indices and fields, adding or removing nodes, and allocating shards to nodes. Each cluster has a single master node, which is elected from the master-eligible nodes using a distributed consensus algorithm. If the current master node fails, a new master is elected.
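
As a simplified illustration of why the number of master-eligible nodes matters for elections, the sketch below assumes a plain majority rule: a master can only be elected while more than half of the master-eligible nodes are reachable. The function names are invented for this example and are not Elasticsearch APIs, and the real election logic has more detail than this.

```python
def election_quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - election_quorum(master_eligible_nodes)


# Three master-eligible nodes survive the loss of one; two survive none.
quorum_of_three = election_quorum(3)
spare_of_three = tolerated_failures(3)
spare_of_two = tolerated_failures(2)
```

Under this assumption, going from two to three master-eligible nodes is what first allows the cluster to lose a master-eligible node and still elect a master.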

Coordinator Node (aka client node)

A coordinator node is a node that holds no configured role: it does not hold data, is not part of the master-eligible group, and does not execute ingest pipelines. It serves incoming search requests and acts as the query coordinator, running the query and fetch phases and sending requests to every node that holds a shard being queried. It also distributes bulk indexing operations and routes queries to shard copies based on each node’s responsiveness.
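
The query and fetch phases described above can be sketched with in-memory data. This is a toy model under stated assumptions, not how Elasticsearch is implemented: shards are plain dictionaries, the score is just a term count, and the names `query_phase`, `coordinate_search` and `SHARDS` are invented for the example.

```python
import heapq


def query_phase(shard_docs, term, size):
    """Per-shard query: return the top `size` (score, doc_id) pairs."""
    hits = [(text.count(term), doc_id)
            for doc_id, text in shard_docs.items() if term in text]
    return heapq.nlargest(size, hits)


def coordinate_search(shards, term, size):
    """Scatter the query to every shard, merge the results, then fetch."""
    # Query phase: each shard returns only scores and document ids.
    partial = []
    for shard_id, docs in shards.items():
        for score, doc_id in query_phase(docs, term, size):
            partial.append((score, doc_id, shard_id))
    # Merge: keep the global top `size` hits across all shards.
    top = heapq.nlargest(size, partial)
    # Fetch phase: load full documents only from the shards that hold them.
    return [(doc_id, shards[shard_id][doc_id]) for _, doc_id, shard_id in top]


# Hypothetical two-shard index; term count is a stand-in for real scoring.
SHARDS = {
    0: {"a": "error on node", "b": "all good"},
    1: {"c": "node node failed", "d": "node restarted"},
}

results = coordinate_search(SHARDS, "node", size=2)
```

The point of the two phases is that only ids and scores travel during the query phase; full documents are fetched just for the hits that make the final cut.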

Replication in Elasticsearch

What it is

Replication refers to storing redundant copies of the data. Starting from version 7.x, Elasticsearch by default creates each index with one primary shard and a replication factor of 1. A replica is never assigned to the same node as its primary shard, which means you need at least two nodes in the cluster for replicas to be assigned. If a primary shard goes down, a replica is automatically promoted to primary.
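
The rule that a replica cannot share a node with its primary puts an upper bound on how many replicas can actually be placed. The sketch below works that out under a simplifying assumption: it only counts nodes and ignores disk watermarks, allocation filtering and other allocation rules. The function names are invented for illustration.

```python
def assignable_replicas(number_of_replicas, data_nodes):
    """Replica copies per primary that can actually be placed.

    Each copy of a shard (primary or replica) needs its own node, so at
    most data_nodes - 1 replicas of a given primary can be assigned.
    """
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    return min(number_of_replicas, data_nodes - 1)


def expected_health(number_of_replicas, data_nodes):
    """'green' if every replica can be placed, otherwise 'yellow'."""
    placed = assignable_replicas(number_of_replicas, data_nodes)
    return "green" if placed == number_of_replicas else "yellow"


# A single-node cluster with the default of one replica stays yellow.
single_node = expected_health(1, 1)
two_nodes = expected_health(1, 2)
```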

What it is used for

Replicas are used to provide high availability and failover. A higher number of replicas can also speed up searches, since any copy of a shard can serve a search request.

Examples

Update replication count

PUT /api-logs/_settings?pretty
{
    "index" : {
        "number_of_replicas" : 2
    }
}
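
The same request can be built from a script. The sketch below uses only the Python standard library and constructs the request without sending it; the address `http://localhost:9200` is an assumption (a local, unsecured test cluster), and the helper name `build_replica_update` is invented for this example.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build (but do not send) the PUT request that sets number_of_replicas."""
    body = {"index": {"number_of_replicas": replicas}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address of a local, unsecured test cluster.
request = build_replica_update("http://localhost:9200", "api-logs", 2)
# To actually send it: urllib.request.urlopen(request)
```

A secured cluster would additionally need authentication and TLS settings, which are left out here.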

Common problems
  • By default, if disk usage on a node reaches 85%, the replicas of newly created indices are not assigned to that node and Elasticsearch logs a warning.
  • Creating too many replicas may cause a problem if there are not enough resources available in the cluster. 
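
The 85% threshold from the first point can be checked locally before it becomes a problem. The sketch below is an approximation: it compares the used fraction of a filesystem against 85% using the standard library, which may not match exactly how Elasticsearch measures disk usage. The names `over_low_watermark` and `check_path` are invented, and the path passed in should be the node's data directory (the default `"."` is only a placeholder).

```python
import shutil

LOW_WATERMARK = 0.85  # the default 85% threshold mentioned above


def over_low_watermark(used_bytes, total_bytes, threshold=LOW_WATERMARK):
    """True if the used fraction of the disk has reached the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= threshold


def check_path(path="."):
    """Apply the same check to the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return over_low_watermark(usage.used, usage.total)


# Placeholder path; on a real node this would be the data directory.
local_result = check_path(".")
```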


To help troubleshoot related issues, we have gathered selected Q&A from the community and issues from GitHub. Please review the following for further information:

Es Nodes Crashing Failed To Send Fa
discuss.elastic.co/t/es-nodes-crashing-failed-to-send-failed-shard/62897

 

Failed To List Shard For Shard Stor
discuss.elastic.co/t/failed-to-list-shard-for-shard-store-on-node-on-big-environments/179749

 


Log Context

The log “failed to perform on node” comes from the class TransportReplicationAction (TransportReplicationAction.java).
We extracted the following from the Elasticsearch source code for in-depth context:

                     
                    @Override
                    public void handleException(TransportException exp) {
                        onReplicaFailure(nodeId, exp);
                        logger.trace("[{}] transport failure during replica request [{}], action [{}]", exp, node, replicaRequest, transportReplicaAction);
                        if (ignoreReplicaException(exp) == false) {
                            logger.warn("{} failed to perform {} on node {}", exp, shardId, transportReplicaAction, node);
                            shardStateAction.shardFailed(shard, indexUUID, "failed to perform " + actionName + " on replica on node " + node, exp);
                        }
                    }
                }
            );
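
When scanning logs for this warning, it can help to pull out the shard, action and node from each matching line. The sketch below builds a regular expression from the message template `{} failed to perform {} on node {}` shown in the source extract. The sample log line, including its prefix, index name, action name and node name, is invented for illustration; real lines may be formatted differently depending on version and logging configuration.

```python
import re

# Mirrors the message template "{} failed to perform {} on node {}".
PATTERN = re.compile(
    r"(?P<shard>\[[^\]]+\]\[\d+\]) failed to perform "
    r"(?P<action>\S+) on node (?P<node>.+)$"
)


def parse_replica_failure(line):
    """Return shard, action and node from a matching log line, else None."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


# Hypothetical log line; every value in it is invented for this example.
SAMPLE = (
    "[WARN ][action.bulk] [api-logs][0] failed to perform "
    "indices:data/write/bulk[s][r] on node {data-2}"
)

parsed = parse_replica_failure(SAMPLE)
```

Grouping the parsed results by node is one way to tell whether the failures point at a single unhealthy replica node or at a wider problem.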





About Opster

Incorporating deep knowledge and a broad history of Elasticsearch issues, Opster’s solution identifies and predicts root causes of Elasticsearch problems, provides recommendations, and can automatically perform various actions to manage, troubleshoot and prevent issues.

We are constantly updating our analysis of Elasticsearch logs, errors, and exceptions, sharing best practices and providing troubleshooting guides.
