Elasticsearch OpenSearch _source

By Opster Team

Updated: Aug 29, 2023

Overview

When a document is sent for indexing, OpenSearch indexes its fields into search structures such as the inverted index, and it also keeps the original JSON document in a special field called _source.
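
To see this in practice, the following sketch indexes one document and then fetches it by ID. The index name api-logs and the fields api_name and status_code are borrowed from the examples in this guide; the document ID and the field values are made up for illustration. With _source enabled (the default), the response to the GET request should contain the original JSON under the _source key.

```json
PUT api-logs/_doc/1
{
  "api_name": "get_user",
  "status_code": 200
}

GET api-logs/_doc/1
```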

Examples

Disabling the _source field for an index:

PUT /api-logs?pretty
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}

Storing only selected fields in the _source field:

PUT api-logs
{
  "mappings": {
    "_source": {
      "includes": [
        "*.count",
        "error_info.*"
      ],
      "excludes": [
        "error_info.traceback_message"
      ]
    }
  }
}

Returning only selected fields at search time using source filtering:

GET api-logs/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": ["api_name", "status_code", "*id"]
  }
}

Notes

The _source field costs extra storage space, but it serves several purposes:

  • It is returned as part of each hit in a search response.
  • It is required for reindexing and for update and update_by_query operations.
  • It is used for highlighting when a field is not stored separately, that is, when the field is not mapped with "store": true.
  • It allows selecting which fields are returned in a response (source filtering).
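
As a sketch of the update case in the list above: a partial update sends only the changed fields, and the rest of the document is rebuilt from the stored _source, so a request like this one is expected to fail on an index where _source is disabled. The document ID and the new value are hypothetical.

```json
POST api-logs/_update/1
{
  "doc": {
    "status_code": 503
  }
}
```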

The main concern with the _source field is the extra disk space it uses. This can be reduced by setting the index.codec index setting to best_compression, which compresses stored fields (including _source) more aggressively, typically at some cost in indexing and retrieval speed.
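
A minimal sketch of creating an index with this setting. The index name api-logs-compressed is hypothetical. index.codec is a static setting, so it is normally supplied when the index is created (or changed while the index is closed), and a changed codec only applies to segments written after the change.

```json
PUT api-logs-compressed
{
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}
```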
