Elasticsearch Named Queries in Elasticsearch

Average Read Time

2 Mins


Opster Expert Team - Saskia

Dec-2021


Opster Team

October 2021




What are named queries?

Named queries are a lesser-known Elasticsearch feature that is useful for several purposes.

As the name implies, named queries let you label your queries with a name. Most search applications use more than one query in their query template. These can be separate templates or a single template that combines multiple sub-queries, each matching a specific search requirement.

When you add a name to each of these low-level queries, Elasticsearch returns, with each hit in the response, a list of the names of all queries that matched it.

This can be utilized in a variety of use cases:

Use case 1 – query debugging

Named queries can simplify query debugging considerably. Have you ever had to use the _explain feature? It tells you exactly how the score was computed. This can be useful, but the output is often too detailed and very hard to read.

You will often be asked why one hit was ranked higher than another. Very often the reason is that a query clause did not match one of the documents at all. Named queries show you which clauses matched each hit, so you can refine and enhance your queries over time.

Use case 2 – specific query logic

Named queries can be useful if you need very specific query logic implemented in your backend.

In e-commerce, you often want to display just one result and redirect to its detail page, for example when the query matched a product ID.

With named queries, this feature is easy to implement in an elegant way. You don’t have to run two parallel queries, and you don’t have to parse the query results and re-apply logic that Elasticsearch has already applied for you. You just check by name whether any of your hits was an exact match on product ID, and then display it.

Example

Query – provide a name with each query clause:

GET _search
{
   "query":{
      "bool":{
         "should":[
            {
               "multi_match":{
                  "query":"123",
                  "fields":[
                     "title",
                     "description"
                  ],
                  "_name":"match content"
               }
            },
            {
               "match":{
                  "product_id":{
                     "query":"123",
                     "_name":"match product id"
                  }
               }
            }
         ]
      }
   }
}

Response – just use the content of “matched_queries” for further processing in your backend application:

{
   "hits":{
      "total":1,
      "max_score":0.2876821,
      "hits":[
         {
            "_index":"test",
            "_type":"doc",
            "_id":"1",
            "_score":0.2876821,
            "_source":{
               "product_id":"123"
            },
            "matched_queries":[
               "match product id"
            ]
         }
      ]
   }
}
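
The check described in use case 2 can be sketched in a few lines of backend code. This is a minimal Python sketch, assuming the search response has already been parsed into a dict; the function name find_exact_product_hit and the constant PRODUCT_ID_CLAUSE are hypothetical and not part of any Elasticsearch client.

```python
from typing import Optional

# Hypothetical constant: must equal the "_name" given to the product ID clause.
PRODUCT_ID_CLAUSE = "match product id"


def find_exact_product_hit(response: dict) -> Optional[dict]:
    """Return the first hit that matched the product ID clause, or None."""
    for hit in response.get("hits", {}).get("hits", []):
        # "matched_queries" is absent when no named clause matched the hit.
        if PRODUCT_ID_CLAUSE in hit.get("matched_queries", []):
            return hit
    return None


# Trimmed copy of the example response shown above.
response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"product_id": "123"},
                "matched_queries": ["match product id"],
            }
        ]
    }
}

hit = find_exact_product_hit(response)
if hit is not None:
    print(f"redirect to the detail page of document {hit['_id']}")
else:
    print("show the regular result list")
```

The lookup relies only on the clause name, so the backend does not need to know how the product ID clause itself is built.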

Use case 3 – diversifying search results

Another typical use case is diversifying your search results. You may want to group your hits by category and show only the best hit per category. You can do this by grouping the results by a specific query name in the backend and displaying just the top pick per group.
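
The grouping step might look like the following Python sketch. It assumes that each category is covered by its own named clause, that every hit carries a numeric _score, and that the hits have already been parsed into dicts. The clause names "category books" and "category music" and the function top_hit_per_query_name are made up for illustration.

```python
def top_hit_per_query_name(hits: list) -> dict:
    """Map each query name to the highest-scoring hit that matched it."""
    best = {}
    for hit in hits:
        for name in hit.get("matched_queries", []):
            current = best.get(name)
            # Keep the hit with the highest score seen so far for this name.
            if current is None or hit["_score"] > current["_score"]:
                best[name] = hit
    return best


# Hypothetical hits; only the fields used by the function are included.
hits = [
    {"_id": "1", "_score": 2.1, "matched_queries": ["category books"]},
    {"_id": "2", "_score": 1.7, "matched_queries": ["category books", "category music"]},
    {"_id": "3", "_score": 0.9, "matched_queries": ["category music"]},
]

for name, hit in top_hit_per_query_name(hits).items():
    print(name, "->", hit["_id"])
```

A hit that matches several named clauses can become the top pick of more than one group, as document 2 would if it outscored document 1; whether that is desirable depends on the application.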

Use case 4 – logging

Another important use case for named queries is logging. If you log all your queries, the full JSON source of each one can be too long to read comfortably.

If you instead log only the query terms and the matched queries, along with some performance metrics, the logs are much easier to read and make sense of.
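
One possible shape for such a compact log entry is sketched below in Python. It assumes the response dict contains the standard "took" field (the search time in milliseconds) next to the hits; the function summarize_search, the logger name "search" and the sample response are illustrative only.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("search")  # hypothetical logger name


def summarize_search(query_terms: str, response: dict) -> dict:
    """Reduce a search response to the parts worth logging."""
    hits = response.get("hits", {}).get("hits", [])
    return {
        "terms": query_terms,
        "took_ms": response.get("took"),
        "hits": [
            {"id": hit["_id"], "matched": hit.get("matched_queries", [])}
            for hit in hits
        ],
    }


# Hypothetical response, trimmed to the fields used above.
response = {
    "took": 3,
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["title match", "content match"]}
        ]
    },
}

# One short JSON line per search instead of the full query source.
logger.info(json.dumps(summarize_search("cats", response)))
```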

Named query example

Let’s see an example:

First, let’s index two sample documents and then run a bool query on that index. The bool query contains two “should” clauses, each tagged with a name. In the result, we’ll be able to see which of the query clauses matched.

PUT named_queries/_doc/1
{
  "title" : "Cats",
  "content" : "Cats are cute"
}

PUT named_queries/_doc/2
{
  "title" : "Dogs",
  "content" : "Dogs are cute"
}

GET named_queries/_search
{
   "query":{
      "bool":{
         "should":[
            {
               "match":{
                  "title":{
                     "query":"cats",
                     "_name":"title match"
                  }
               }
            },
            {
               "match":{
                  "content":{
                     "query":"cats",
                     "_name":"content match"
                  }
               }
            }
         ]
      }
   }
}

In the query result, each hit carries an additional field called “matched_queries”. It lists the names of all query clauses that matched this document. Only document 1 is returned, because document 2 does not contain “cats” in either field:

{
   "hits":{
      "total":{
         "value":1,
         "relation":"eq"
      },
      "max_score":0.5753642,
      "hits":[
         {
            "_index":"named_queries",
            "_type":"_doc",
            "_id":"1",
            "_score":0.5753642,
            "_source":{
               "title":"Cats",
               "content":"Cats are cute"
            },
            "matched_queries":[
               "title match",
               "content match"
            ]
         }
      ]
   }
}
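
For debugging, as in use case 1, it is often more telling to list the clauses that did not match a hit. The following Python sketch does that for the response above, assuming the backend keeps a list of the clause names it sent; EXPECTED_CLAUSES and unmatched_clauses are hypothetical names.

```python
# Names of the clauses used in the bool query above.
EXPECTED_CLAUSES = ["title match", "content match"]


def unmatched_clauses(hit: dict, expected: list) -> list:
    """Return the expected clause names that are missing from the hit."""
    matched = set(hit.get("matched_queries", []))
    return [name for name in expected if name not in matched]


# The hit from the response above, trimmed to the relevant fields.
cats_hit = {"_id": "1", "matched_queries": ["title match", "content match"]}

# A made-up hit that matched on content only, for comparison.
partial_hit = {"_id": "42", "matched_queries": ["content match"]}

for hit in (cats_hit, partial_hit):
    missing = unmatched_clauses(hit, EXPECTED_CLAUSES)
    print(hit["_id"], "did not match:", missing)
```

A hit with missing clauses usually scores lower than one that matched all of them, which is often enough to explain a ranking without reading the full _explain output.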

