This guide will help you check for common problems that cause the log "Invalid alias filter" to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: filter, search and alias.
Overview
A filter in Elasticsearch applies conditions inside a query to narrow down the matching result set.
What it is used for
When a query is executed, Elasticsearch by default calculates a relevance score for each matching document. Some conditions do not need a score, for instance whether a document's timestamp falls between two given values. For such yes/no criteria, a filter clause is used.
Examples
Return all documents in a given index whose `created_at` value falls within a date range:
GET my_index/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "created_at": {
            "gte": "2020-01-01",
            "lte": "2020-01-10"
          }
        }
      }
    }
  }
}
Notes
- Queries are used to find out how relevant a document is to a particular query by calculating a score for each document, whereas filters are used to match certain criteria and are cacheable to enable faster execution.
- Filters do not contribute to scoring and thus are faster to execute.
- Elasticsearch 2.x introduced major changes to how queries and filters are written and executed internally.
Common problems
- The most common problem with filters is incorrect use inside the query. If filters are not used correctly, query performance can be significantly affected. Use a filter wherever a score does not need to be calculated.
- Another problem often arises in date range filters that use "now" to represent the current time. Because the value of "now" changes continuously, Elasticsearch cannot cache the response, since the matching data set keeps changing.
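A common way to soften the "now" problem is to round the date math expression, for example to the day, so that the same filter is produced for a whole day. The sketch below only builds the request body as a Python dictionary; the helper name `recent_docs_filter` and the field name `created_at` are illustrative assumptions, and whether rounding is acceptable depends on how fresh your results need to be.

```python
import json


def recent_docs_filter(field: str, days: int) -> dict:
    """Build a search body matching the last `days` days, rounded to day boundaries."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "range": {
                        field: {
                            # "/d" rounds the date math expression to the day,
                            # so the expression resolves to the same bounds all day.
                            "gte": f"now-{days}d/d",
                            "lte": "now/d",
                        }
                    }
                }
            }
        }
    }


body = recent_docs_filter("created_at", 7)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request in place of a filter that uses a bare "now".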
Overview
Search means looking up documents in one or more indices. The simplest search is a GET request to the _search endpoint. The search query can be provided either in the query string or in a request body.
Examples
If no search parameters are provided when searching an index, every document is a hit and, by default, 10 hits are returned.
GET my_documents/_search
A JSON object is returned in response to a search query. A 200 response code means the request was completed successfully.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ ... ]
  }
}
Notes and good things to know
- Distributed search is challenging because every shard of the index needs to be searched for hits, and those hits are then combined into a single sorted list as the final result.
- There are two phases of search: the query phase and the fetch phase.
- In the query phase, the query is executed on each shard locally and top hits are returned to the coordinating node. The coordinating node merges the results and creates a global sorted list.
- In the fetch phase, the coordinating node fetches the actual documents for those hit IDs and returns them to the requesting client.
- A coordinating node needs enough memory and CPU in order to handle the fetch phase.
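The two phases described above can be illustrated with a small in-memory simulation. This is a toy model, not Elasticsearch code: the `SHARDS` data, the scores and the function names are all made up for the example.

```python
import heapq

# Hypothetical per-shard data: shard id -> {doc_id: (score, source)}
SHARDS = {
    0: {"a": (1.2, {"title": "alpha"}), "b": (0.4, {"title": "beta"})},
    1: {"c": (2.5, {"title": "gamma"}), "d": (0.9, {"title": "delta"})},
}


def query_phase(shard_id: int, size: int) -> list:
    """Each shard returns only (score, shard_id, doc_id) for its local top hits."""
    docs = SHARDS[shard_id]
    local = [(score, shard_id, doc_id) for doc_id, (score, _) in docs.items()]
    return heapq.nlargest(size, local)


def search(size: int) -> list:
    # Query phase: gather the local top hits of every shard on the "coordinating node".
    candidates = []
    for shard_id in SHARDS:
        candidates.extend(query_phase(shard_id, size))
    # Merge the per-shard lists into one globally sorted list.
    top = heapq.nlargest(size, candidates)
    # Fetch phase: load the full documents only for the global winners.
    return [
        {"_id": doc_id, "_score": score, "_source": SHARDS[shard_id][doc_id][1]}
        for score, shard_id, doc_id in top
    ]


results = search(size=3)
print([hit["_id"] for hit in results])
```

Only IDs and scores travel during the query phase in this model; the document sources are looked up for the final hits alone, which mirrors why the fetch phase is the part that handles full documents.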
Overview
In Elasticsearch, an alias is a secondary name that refers to a group of data streams or indices. Aliases can be created and removed dynamically using the _aliases REST endpoint.
There are two types of aliases:
- Data Stream Aliases: A data stream alias points to one or more data streams. In the cluster state, data stream aliases are kept distinct from data streams.
- Index Aliases: An index alias points to one or more indices.
The master node manages the cluster state, which includes aliases.
Creating and removing aliases
Creating an alias on a single index:
POST /_aliases?pretty
{
  "actions": [
    { "add": { "index": "index_1", "alias": "alias_1" } }
  ]
}
Creating an alias that is tied to more than one index:
POST /_aliases?pretty
{
  "actions": [
    { "add": { "index": "index_1", "alias": "alias_1" } },
    { "add": { "index": "index_2", "alias": "alias_1" } }
  ]
}
Listing out all of the available aliases in an Elasticsearch cluster:
GET _cat/aliases
Removing an alias:
POST /_aliases?pretty
{
  "actions": [
    { "remove": { "index": "index_2", "alias": "alias_1" } }
  ]
}
Alias use cases
Aliases are used for multiple purposes such as to search across more than one index with a single name, perform the reindexing process with zero downtime and query data based on predefined filters. Below are 6 different use cases for aliases.
1. Filter-based aliases to limit access to data
One use case is making a filter-based alias, which is quite useful when you need to limit access to data. When a query is executed, an alias can apply a filter automatically.
For example, consider an index named `opster-idx`. To expose only the documents whose company is `opster`, you can create an alias that applies this filter automatically, as shown in the steps below.
Index documents:
PUT /opster-idx/_doc/1
{
  "title": "Taking Care of Your Entire Search Operation",
  "company": "Opster"
}
PUT /opster-idx/_doc/2
{
  "title": "Streaming service",
  "company": "XYZ"
}
To create a filtered alias, add the query to the `filter` parameter of the `add` action. The query limits the documents that the alias can access.
POST /_aliases?pretty
{
  "actions": [
    {
      "add": {
        "index": "opster-idx",
        "alias": "opster-alias",
        "filter": {
          "term": { "company": "opster" }
        }
      }
    }
  ]
}
When you run a match_all query on `opster-alias`, only the documents that match the term query defined in the alias filter (documents whose company is `opster`) appear in the search results. The lowercase term `opster` matches the indexed value "Opster" because, with the default dynamic mapping, `company` is a text field whose tokens are lowercased by the standard analyzer.
Search query:
GET opster-alias/_search
{
  "query": {
    "match_all": {}
  }
}
Search response:
"hits": {
  "total": { "value": 1, "relation": "eq" },
  "max_score": 1.0,
  "hits": [
    {
      "_index": "opster-idx",
      "_id": "1",
      "_score": 1.0,
      "_source": {
        "title": "Taking Care of Your Entire Search Operation",
        "company": "Opster"
      }
    }
  ]
}
Now, if you query through the alias for documents with the company name "XYZ":
GET opster-alias/_search
{
  "query": {
    "match": { "company": "XYZ" }
  }
}
The search response will be:
"hits": {
  "total": { "value": 0, "relation": "eq" },
  "max_score": null,
  "hits": []
}
2. Combining routing with aliases
In the following example, the alias filters for the company `opster` and sets the routing value to `1`, so that searching and indexing through the alias are limited to the shard that the routing value maps to.
Routing is a string value that is used to route indexing and search operations to a specific shard.
POST /_aliases?pretty
{
  "actions": [
    {
      "add": {
        "index": "index_1",
        "alias": "alias_2",
        "filter": {
          "term": { "company": "opster" }
        },
        "routing": "1"
      }
    }
  ]
}
3. Transitioning to new indices
You can use aliases when your application needs to seamlessly transition from an old index to a new index with no downtime.
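One way to make such a transition is to send the `remove` and the `add` action in a single `_aliases` request, so the alias never points at nothing. The helper below is a minimal sketch that only builds the request body; the function name and the index names `products_v1` and `products_v2` are hypothetical.

```python
import json


def swap_alias(alias: str, old_index: str, new_index: str) -> dict:
    """Build an _aliases body that moves `alias` from old_index to new_index."""
    if old_index == new_index:
        raise ValueError("old_index and new_index must differ")
    return {
        "actions": [
            # Both actions travel in one request body.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


swap_body = swap_alias("products", "products_v1", "products_v2")
print(json.dumps(swap_body, indent=2))
```

The application keeps querying `products` throughout, and only the index behind the name changes.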
4. Creating sliding windows into distinct indices
For example, if you create daily indices for your data, you might want an alias named `last-7-days` that provides a sliding window over the data from the previous seven days. Then, each day, when you create a new daily index, you can add it to the alias and remove the eight-day-old index from the alias at the same time.
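The daily update of the window can be sketched as follows. The index naming scheme `logs-YYYY.MM.DD`, the prefix `logs` and the helper names are assumptions for illustration; adapt them to however your daily indices are actually named.

```python
from datetime import date, timedelta


def daily_index(prefix: str, day: date) -> str:
    """Hypothetical naming scheme for daily indices, e.g. logs-2020.01.10."""
    return f"{prefix}-{day:%Y.%m.%d}"


def slide_window(alias: str, prefix: str, today: date, window_days: int = 7) -> dict:
    """Build an _aliases body that adds today's index and drops the expired one."""
    expired = today - timedelta(days=window_days)
    return {
        "actions": [
            {"add": {"index": daily_index(prefix, today), "alias": alias}},
            {"remove": {"index": daily_index(prefix, expired), "alias": alias}},
        ]
    }


window_body = slide_window("last-7-days", "logs", date(2020, 1, 10))
print(window_body)
```

With `today` set to 10 January, the alias keeps the indices for 4 to 10 January and drops the one for 3 January. The `remove` action only detaches that index from the alias; it does not delete the index.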
5. Aliases and ILM for updating or deleting documents
You can use an index alias and index template with ILM to manage and roll over the alias’s indices if you need to update or delete existing documents across numerous indices frequently.
6. Querying data from a frozen index
When using ILM, if you still need to query data from a frozen index, you can use the alias to do so. Instead of searching the index directly, you can run the search query on the alias, which will increase performance and allow you to get a response with fewer resources.
Notes
- An alias cannot be used for indexing if it points to more than one index and none of them is designated as the write index. If attempted, Elasticsearch throws an exception.
- Deleting an alias does not delete the actual index.
Common problems
When you try to index a document into an alias that points to more than one index without a designated write index, Elasticsearch returns an error because it doesn't know which concrete index the document should be indexed to.
You will get the following error message:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "no write index is defined for alias [my-alias]. The write index may be explicitly disabled using is_write_index=false or the alias points to multiple indices without one being designated as a write index"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "no write index is defined for alias [my-alias]. The write index may be explicitly disabled using is_write_index=false or the alias points to multiple indices without one being designated as a write index"
  },
  "status" : 400
}
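As the error message suggests, one way out is to designate exactly one of the alias's indices as the write index with `is_write_index`. The sketch below builds such an `_aliases` body; the helper name and the index names are hypothetical, and the alias name is taken from the error above.

```python
import json


def alias_with_write_index(alias: str, indices: list, write_index: str) -> dict:
    """Build an _aliases body where exactly one index is marked as the write index."""
    if write_index not in indices:
        raise ValueError(f"{write_index!r} is not one of the alias's indices")
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    # True only for the index that should receive writes.
                    "is_write_index": index == write_index,
                }
            }
            for index in indices
        ]
    }


write_body = alias_with_write_index("my-alias", ["index_1", "index_2"], "index_2")
print(json.dumps(write_body, indent=2))
```

Searches through `my-alias` still cover both indices, while documents indexed through the alias go to `index_2`.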
Log Context
Log "Invalid alias filter": the class name is ShardSearchRequest.java. We extracted the following from the Elasticsearch source code for those seeking in-depth context:
        return null;
    }
    try {
        return filterParser.apply(alias.filter().uncompressed());
    } catch (IOException ex) {
        throw new AliasFilterParsingException(index, alias.getAlias(), "Invalid alias filter", ex);
    }
};
if (aliasNames.length == 1) {
    AliasMetadata alias = aliases.get(aliasNames[0]);
    if (alias == null) {
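Read loosely, the fragment parses the filter stored with the alias when a search request arrives and wraps any parsing failure in an exception that names the index and the alias. The Python analog below is only meant to show that control flow: `json.loads` stands in for Elasticsearch's filter parser, and the class and function names are hypothetical.

```python
import json


class AliasFilterParsingError(Exception):
    """Hypothetical Python counterpart of AliasFilterParsingException."""

    def __init__(self, index: str, alias: str, message: str):
        super().__init__(f"[{index}] [{alias}], {message}")
        self.index = index
        self.alias = alias


def parse_alias_filter(index: str, alias: str, raw_filter):
    """Parse the stored filter of an alias, or return None if it has none."""
    if raw_filter is None:
        return None
    try:
        # Stand-in for the real filter parser.
        return json.loads(raw_filter)
    except json.JSONDecodeError as ex:
        # Wrap the low-level failure so the message names the index and alias.
        raise AliasFilterParsingError(index, alias, "Invalid alias filter") from ex


parsed = parse_alias_filter("opster-idx", "opster-alias", '{"term": {"company": "opster"}}')
print(parsed)
```

When this log appears, the index and alias named in the message tell you which alias definition to inspect.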