How To Solve Issues Related to Log – failed to perform on node

Updated: Feb-20

Troubleshooting background

To troubleshoot the Elasticsearch log message “failed to perform on node”, it helps to understand two related Elasticsearch concepts: nodes and replication. Both are explained below, along with common problems, examples and useful tips.

Nodes in Elasticsearch

What it is

Simply put, a node is a single server that is part of a cluster. Each node is assigned one or more roles, which describe the node’s responsibilities and operations. Data nodes store the data and participate in the cluster’s indexing and search operations, while master nodes are responsible for managing cluster activities and storing the cluster state, including the metadata.

While it’s possible to run several node instances of Elasticsearch on the same hardware, it’s considered a best practice to run a single Elasticsearch instance per server.

Nodes connect to each other and form a cluster by using a discovery method. 

Roles
Master node

Master nodes are in charge of cluster-wide settings and changes: creating or deleting indices and fields, adding or removing nodes, and allocating shards to nodes. Each cluster has a single active master node, which is elected from the master-eligible nodes using a distributed consensus algorithm. A new master is elected if the current one fails.
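
As a rough illustration of why the number of master-eligible nodes matters: consensus-based election needs a majority of the master-eligible nodes to agree. The sketch below is not Elasticsearch code. The function names are hypothetical, and it only shows the majority arithmetic, assuming every master-eligible node gets one vote.

```python
def majority(master_eligible_nodes):
    """Smallest number of master-eligible nodes that forms a majority."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(master_eligible_nodes, reachable_nodes):
    """An election can succeed only if a majority of nodes can communicate."""
    return reachable_nodes >= majority(master_eligible_nodes)


for total in (1, 2, 3, 5):
    needed = majority(total)
    print(total, "master-eligible:", needed, "needed,",
          total - needed, "failures tolerated")
```

Under this assumption, three master-eligible nodes tolerate the loss of one, while two tolerate none.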

Coordinator Node (aka client node)

A coordinator node is a node that does not hold any configured role: it does not hold data, is not part of the master-eligible group, and does not execute ingest pipelines. A coordinator node serves incoming search requests and acts as the query coordinator, running the query and fetch phases and sending requests to every node that holds a shard being queried. It also distributes bulk indexing operations and routes queries to shard copies based on the nodes’ responsiveness.
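
The query and fetch phases described above can be sketched as a small in-memory simulation. This is a simplified illustration, not Elasticsearch’s implementation: the shard contents, the scores, and the names `query_phase`, `fetch_phase` and `coordinate_search` are all hypothetical.

```python
import heapq

# Hypothetical shard contents: doc id -> (score, source document).
SHARDS = {
    "shard-0": {"a": (0.9, {"msg": "alpha"}), "b": (0.2, {"msg": "bravo"})},
    "shard-1": {"c": (0.7, {"msg": "charlie"}), "d": (0.8, {"msg": "delta"})},
}


def query_phase(shard_name, size):
    """Each shard returns only (score, shard, doc id) for its local top hits."""
    docs = SHARDS[shard_name]
    hits = [(score, shard_name, doc_id) for doc_id, (score, _) in docs.items()]
    return heapq.nlargest(size, hits)


def fetch_phase(shard_name, doc_id):
    """Return the full document for one winning hit."""
    return SHARDS[shard_name][doc_id][1]


def coordinate_search(size):
    # Scatter: ask every shard for its local top `size` hits.
    candidates = []
    for shard_name in SHARDS:
        candidates.extend(query_phase(shard_name, size))
    # Gather: merge the per-shard results into a global top `size`.
    winners = heapq.nlargest(size, candidates)
    # Fetch: retrieve full documents for the global winners only.
    return [fetch_phase(shard, doc_id) for _, shard, doc_id in winners]


print(coordinate_search(2))
```

The point of the two phases is that only lightweight ids and scores travel during the query phase, and full documents are fetched just for the hits that make the final result.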

Replication in Elasticsearch

What it is

Replication refers to storing redundant copies of the data. Starting from version 7.x, Elasticsearch by default creates each index with one primary shard and a replication factor of 1. A replica is never assigned to the same node as its primary shard, which means you need at least two nodes in the cluster for replicas to be assigned. If a primary shard goes down, a replica is automatically promoted to primary.
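
The allocation rule and the promotion behaviour above can be illustrated with a toy model for a single shard. This is only a sketch under simplifying assumptions (one shard, at most one copy per node, hypothetical function names), not how the Elasticsearch allocator is actually implemented.

```python
def assign_shard_copies(nodes, number_of_replicas):
    """Place one primary and its replicas, never two copies on the same node."""
    copies = ["primary"] + [f"replica-{i}" for i in range(1, number_of_replicas + 1)]
    assigned = dict(zip(nodes, copies))   # at most one copy per node
    unassigned = copies[len(nodes):]      # copies left without a node
    return assigned, unassigned


def health(nodes, number_of_replicas):
    """Green: all copies assigned. Yellow: replicas missing. Red: no primary."""
    assigned, unassigned = assign_shard_copies(nodes, number_of_replicas)
    if "primary" not in assigned.values():
        return "red"
    return "yellow" if unassigned else "green"


def promote_after_failure(assigned, failed_node):
    """Drop the failed node; if it held the primary, promote a surviving replica."""
    remaining = {n: c for n, c in assigned.items() if n != failed_node}
    if assigned.get(failed_node) == "primary" and remaining:
        survivor = next(iter(remaining))
        remaining[survivor] = "primary"
    return remaining


print(health(["node-1"], 1))            # single node: the replica has nowhere to go
print(health(["node-1", "node-2"], 1))  # two nodes: both copies are assigned
```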

What it is used for

Replicas provide high availability and failover. A higher number of replicas can also speed up searches, because a search request can be served by any copy of a shard.

Examples

Update replication count

PUT /api-logs/_settings?pretty
{
    "index" : {
        "number_of_replicas" : 2
    }
}
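
The same request can also be built from a script. The sketch below uses only the Python standard library and assumes an unsecured cluster reachable at http://localhost:9200 (an assumption, adjust it for your environment); the helper name `build_replica_update` is hypothetical. It builds the request without sending it.

```python
import json
import urllib.request


def build_replica_update(base_url, index, replicas):
    """Build (without sending) the PUT request that changes number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed endpoint; "api-logs" is the index from the example above.
request = build_replica_update("http://localhost:9200", "api-logs", 2)
print(request.get_method(), request.full_url)

# To actually send it against a reachable cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```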
Common problems
  • By default, if disk usage on a node reaches 85%, replicas of newly created indices are not assigned to that node and Elasticsearch logs a warning.
  • Creating too many replicas may cause problems if the cluster does not have enough resources.
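
The 85% threshold mentioned above is the default low disk watermark. As a rough sketch, the function below classifies a node’s disk usage against the default watermarks, assuming the usual defaults of 85% (low), 90% (high) and 95% (flood stage). All three are configurable, so the constants and the function name are illustrative rather than authoritative.

```python
# Assumed default disk watermarks, as percent of disk used.
LOW, HIGH, FLOOD_STAGE = 85.0, 90.0, 95.0


def disk_watermark_level(used_percent):
    """Return which watermark, if any, a node's disk usage has exceeded."""
    if used_percent > FLOOD_STAGE:
        return "flood_stage"  # indices with a shard on this node become read-only
    if used_percent > HIGH:
        return "high"         # Elasticsearch tries to move shards off this node
    if used_percent > LOW:
        return "low"          # replicas of new indices are not allocated here
    return "ok"


for used in (70.0, 87.5, 92.0, 97.0):
    print(used, "->", disk_watermark_level(used))
```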


To help troubleshoot related issues, we have gathered selected Q&A from the community and issues from GitHub. Please review the following for further information:

1 Es Nodes Crashing Failed To Send Fa

2 Failed To List Shard For Shard Stor


Log Context

The log “{} failed to perform {} on node {}” is emitted from the class TransportReplicationAction.java.
We have extracted the following from the Elasticsearch source code to provide in-depth context:

                     
                    @Override
                    public void handleException(TransportException exp) {
                        onReplicaFailure(nodeId, exp);
                        logger.trace("[{}] transport failure during replica request [{}], action [{}]", exp, node, replicaRequest, transportReplicaAction);
                        if (ignoreReplicaException(exp) == false) {
                            logger.warn("{} failed to perform {} on node {}", exp, shardId, transportReplicaAction, node);
                            shardStateAction.shardFailed(shard, indexUUID, "failed to perform " + actionName + " on replica on node " + node, exp);
                        }
                        }
                    }
                }
            );
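
To make the control flow of the excerpt easier to follow, here is a small simulation of the same decision: when a replica request fails, the primary ignores certain exceptions and otherwise logs the warning and reports the replica shard as failed. This is a sketch, not the Elasticsearch implementation. `ShardNotAvailable`, the function name and the sample values are hypothetical stand-ins.

```python
class ShardNotAvailable(Exception):
    """Hypothetical stand-in for an exception the primary is allowed to ignore."""


def handle_replica_exception(shard_id, action, node, exc, warnings, failed_shards):
    """Warn and mark the replica as failed unless the error is ignorable."""
    if isinstance(exc, ShardNotAvailable):
        return  # ignorable: no warning, the shard is not reported as failed
    warnings.append(f"{shard_id} failed to perform {action} on node {node}")
    failed_shards.append((shard_id, node))


warnings, failed_shards = [], []
# Illustrative values only; real shard ids, actions and node names will differ.
handle_replica_exception("[api-logs][0]", "bulk[r]", "node-2",
                         ShardNotAvailable(), warnings, failed_shards)
handle_replica_exception("[api-logs][0]", "bulk[r]", "node-3",
                         TimeoutError("no response"), warnings, failed_shards)
print(warnings)
print(failed_shards)
```

In other words, seeing this warning means a replica copy did not acknowledge an operation and was marked as failed, so the next thing to check is usually the health of the node named in the message.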




