Elasticsearch Query: Optimizing Query Performance

By Opster Team

Updated: Jul 25, 2023

Introduction

Efficient querying is crucial for maintaining high performance in Elasticsearch clusters. This article covers five techniques for optimizing Elasticsearch query performance: using filters, rewriting queries, paginating with `search_after`, caching, and profiling. If you want to learn more about Elasticsearch search, check out this guide.

1. Use Filters for Non-Scoring Queries

When you don’t need a relevance score for your query results, use filters instead of scoring queries such as `match`. Filters are faster because they skip scoring calculations, and frequently used filters are generally cached for better performance. For example, use the `bool` query with a `filter` clause:

GET /_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" }},
        { "range": { "publish_date": { "gte": "now-1d" }}}
      ]
    }
  }
}
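
When a search combines full-text relevance with exact-match conditions, only the full-text part needs to be scored. The Python sketch below builds such a request body as a plain dictionary. The helper name `build_search_body` and the `title` field are illustrative assumptions and not part of any Elasticsearch client.

```python
import json


def build_search_body(text, status, since="now-1d"):
    """Build a bool query that scores only the full-text clause.

    The `must` clause contributes to the relevance score; the clauses
    under `filter` only include or exclude documents.
    """
    return {
        "query": {
            "bool": {
                # Scored: full-text match on a hypothetical `title` field.
                "must": [{"match": {"title": text}}],
                # Not scored: exact-match and range conditions.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"publish_date": {"gte": since}}},
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("elasticsearch", "published"), indent=2))
```

The printed JSON can be sent as the body of a `GET /_search` request.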

2. Rewrite Queries for Better Performance

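Some queries can be rewritten to improve performance while matching the same documents. For example, a `match_phrase` query can in some cases be slower than a `span_near` query with equivalent parameters. Keep in mind that `span_term` clauses are not analyzed, so each term must match the indexed token exactly (for example, already lowercased). Replace the `match_phrase` query with a `span_near` query: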

GET /_search
{
  "query": {
    "span_near": {
      "clauses": [
        { "span_term": { "content": "quick" }},
        { "span_term": { "content": "brown" }},
        { "span_term": { "content": "fox" }}
      ],
      "slop": 0,
      "in_order": true
    }
  }
}
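
The rewrite can be generated from a phrase rather than written by hand. The following is a minimal sketch, assuming the field's analyzer does nothing more than lowercase the text and split it on whitespace. The helper `phrase_to_span_near` is hypothetical, and a field with a different analyzer would need matching tokenization here.

```python
def phrase_to_span_near(field, phrase, slop=0, in_order=True):
    """Turn a phrase into a span_near query with one span_term per token.

    Assumption: the field's analyzer only lowercases and splits on
    whitespace. span_term clauses are not analyzed, so the tokens must
    match what is stored in the index.
    """
    tokens = phrase.lower().split()
    if not tokens:
        raise ValueError("phrase contains no tokens")
    return {
        "query": {
            "span_near": {
                "clauses": [{"span_term": {field: token}} for token in tokens],
                "slop": slop,
                "in_order": in_order,
            }
        }
    }
```

Calling `phrase_to_span_near("content", "Quick brown fox")` yields the same body as the request above. Whether the rewrite is actually faster depends on the data, so it is worth measuring both forms before switching.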

3. Use the `search_after` Parameter for Pagination

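When paginating through large result sets, avoid deep pagination with the `from` and `size` parameters. Each shard has to collect and sort `from + size` hits, so the cost grows with page depth, and by default requests that reach past 10,000 hits are rejected (`index.max_result_window`). Instead, use the `search_after` parameter, passing the sort values of the last hit from the previous page: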

GET /_search
{
  "size": 10,
  "query": { "match_all": {} },
  "sort": [
    { "date": "asc" },
    { "_id": "asc" }
  ],
  "search_after": ["2022-01-01T00:00:00", "doc_id"]
}
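
In application code, `search_after` pagination is usually a loop that feeds the `sort` values of the last hit into the next request. The sketch below shows that loop in Python. The `search` argument is a hypothetical callable that takes a request body and returns the parsed JSON response, and the body is assumed to already contain a `sort` with a unique tiebreaker.

```python
def iterate_hits(search, body, page_size=10):
    """Yield every hit, fetching one page at a time with search_after.

    `search` is a hypothetical callable that takes a request body (dict)
    and returns the parsed JSON response; in practice it would wrap a
    client call or an HTTP request.
    """
    request = dict(body, size=page_size)
    while True:
        hits = search(request)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        if len(hits) < page_size:
            # A short page means there is nothing left to fetch.
            return
        # The sort values of the last hit mark where the next page starts.
        request = dict(request, search_after=hits[-1]["sort"])
```

Because each request carries only the last sort values, the cost per page stays roughly constant no matter how deep the pagination goes.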

4. Leverage the Query Cache

Request caching is enabled by default, so forcing caching only makes a difference if the cache is disabled at the index level or if the request would not be cached by default (for example, a request with `size` greater than 0). For more information, see https://stackoverflow.com/a/63828533/3112848

To force caching for a specific request, use the `request_cache` parameter:

GET /_search?request_cache=true
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "published" }},
        { "range": { "publish_date": { "gte": "now-1d" }}}
      ]
    }
  }
}
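
Cached results are looked up by the request body, so a cache hit requires that repeated requests send an identical body. The Python sketch below is one possible way to keep the body stable: it replaces the moving `now-1d` bound with a timestamp rounded down to the hour and serializes with a fixed key order. The helper `cacheable_body` is hypothetical, and the rounding is a trade-off that widens the time window by up to an hour.

```python
import json
from datetime import datetime, timedelta, timezone


def cacheable_body(status, now=None):
    """Build a request body that stays identical within the same hour.

    `now` must be timezone-aware if given; it defaults to the current
    UTC time.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Round down to the hour so repeated calls share the same lower bound.
    since = (now - timedelta(days=1)).replace(minute=0, second=0, microsecond=0)
    body = {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"publish_date": {"gte": since.strftime("%Y-%m-%dT%H:%M:%SZ")}}},
                ]
            }
        }
    }
    # sort_keys gives a stable key order across calls.
    return json.dumps(body, sort_keys=True)
```

The returned string can be sent as the body of `GET /_search?request_cache=true`.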

5. Use the `profile` API to Identify Slow Queries

The `profile` API can help you identify slow queries and understand their performance characteristics. Add the `profile` parameter to your search request to get detailed profiling information:

GET /_search
{
  "profile": true,
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}
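
The profiling output is deeply nested, so a small script can help surface the expensive parts. The following Python sketch walks the `profile.shards[].searches[].query[]` tree of a parsed response and returns the slowest components. The helper name `slowest_components` is an assumption for illustration, and the sample values used to try it out are made up.

```python
def slowest_components(response, top=3):
    """Return the `top` slowest query components across all shards.

    Walks profile.shards[].searches[].query[] and their children,
    collecting (time_in_nanos, type, description) tuples.
    """
    found = []

    def walk(node):
        found.append((node["time_in_nanos"], node["type"], node["description"]))
        for child in node.get("children", []):
            walk(child)

    for shard in response["profile"]["shards"]:
        for search in shard["searches"]:
            for node in search["query"]:
                walk(node)
    # Largest time first.
    return sorted(found, reverse=True)[:top]
```

A parent component's time includes the time of its children, so a compound query will usually rank above its own clauses. Profiling adds overhead, so it is best used for debugging rather than left on in production requests.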

Conclusion

Optimizing Elasticsearch query performance is essential for maintaining high-performance clusters. By using filters, rewriting queries, paginating with `search_after`, leveraging caching, and using the `profile` API, you can significantly improve the efficiency of your Elasticsearch queries.

