Elasticsearch Node Roles: How to Configure all Node Roles

By Opster Expert Team

Updated: Jan 28, 2024


Quick links

Background – what are nodes?
What are node roles?
Types of node roles
How to reduce Elasticsearch costs by optimizing your node roles
Useful links
Conclusion

Background – what are nodes?

Every Elasticsearch instance we run is called a node, and one or more nodes form a cluster. Each node in a cluster is aware of all the other nodes and forwards requests to the appropriate node. A cluster can consist of a single node, though this isn’t recommended for production. In this article, we will review the different types of node roles and how to configure them in Elasticsearch to enable efficient full-text search.

What are node roles?

The node role defines the purpose of the node and its responsibilities. We can configure multiple roles for each node, depending on the cluster configuration. If we don’t explicitly specify a node’s roles, Elasticsearch automatically assigns every role to that node. This default does not differ among the different versions of Elasticsearch.
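As a minimal sketch of assigning several roles to one node, the “elasticsearch.yml” fragment below gives a node the master, data and ingest roles. The particular combination is an assumption chosen for illustration, not a recommendation from this article.

```yaml
# elasticsearch.yml (illustrative fragment)
# This combination of roles is an assumed example; pick the roles
# that match your own cluster layout.
node.roles: [ master, data, ingest ]
```

A node whose “elasticsearch.yml” has no node.roles line at all keeps the default described above and takes on every role.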

In the cluster details, such nodes are listed with every role assigned.

Types of node roles

Master
Data (data_cold, data_hot, data_frozen, data_warm, data_content)
Coordinating
Ingest
Machine learning
Remote eligible
Transform

Master node

A node to which we assign the master role is a master-eligible node, and the one that wins the election acts as the cluster’s “master” node. The master node manages cluster-wide operations such as creating or deleting an index, and it keeps track of all available nodes in the cluster. When shards are created, the master node decides which node each shard is allocated to. A dedicated master node (one with only the master role) holds no data, and client requests should generally be sent to other nodes.

When does the master election happen? An election takes place at startup or when the current master node goes down. Any master-eligible node except a “voting-only” node can become the master during the election.

To set the node role, edit the node’s “elasticsearch.yml” and add the following line:

node.roles: ["master"]
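The voting-only node mentioned above is configured by adding the voting_only role next to the master role. The fragment below is a sketch of such a node, assuming you want it to take part in elections without ever being elected:

```yaml
# elasticsearch.yml (illustrative fragment)
# A voting-only node must also carry the master role: it votes in
# master elections but cannot itself become the elected master.
node.roles: [ master, voting_only ]
```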

Data node

The node to which we assign a data role is called a “data” node. A data node holds the indexed data and it takes care of CRUD, search and aggregations (operations related to the data).

Without at least one data node, a cluster has nowhere to store or search indexed data. Since the operations carried out by data nodes are I/O, memory and CPU intensive, it is important to monitor these nodes and to allocate enough of them.

There are specialized data roles (data_content, data_hot, data_warm, data_cold and data_frozen) that can be used in a multi-tier deployment architecture. In general it is NOT necessary to configure the specific roles; you can just use the data role. If you want to configure a hot-cold architecture, please see this guide.

To set the node role, edit the node’s “elasticsearch.yml” and add the following line:

node.roles: ["data"]
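For a multi-tier deployment, the specialized roles replace the generic data role on each node. The fragment below is a minimal sketch of a hot-tier node that also holds content data. The split into hot and cold nodes is an assumed layout for illustration only.

```yaml
# elasticsearch.yml for a hot-tier node (illustrative fragment)
# Assumed layout: this node serves the hot tier and the content tier.
node.roles: [ data_hot, data_content ]

# A cold-tier node in the same assumed cluster would instead use:
# node.roles: [ data_cold ]
```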

Data content node

Data content nodes are part of the content tier. These nodes are used mainly to store archive and catalog data, which, unlike logs, is not indexed in real time or at a high rate.

Even though this type of data is not indexed frequently, it still needs to return results quickly. Content nodes are therefore optimized for search performance: they prioritize query processing over I/O throughput, so complex searches and aggregations are processed quickly.