Elasticsearch Source

Last Update: March 2020

Source in Elasticsearch

What it is

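When a document is sent for indexing, Elasticsearch indexes its fields into an inverted index, and it also keeps the original JSON document in a special metadata field called _source. The _source field is stored but not indexed, so it is returned with results but cannot be searched directly.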

Examples

Disabling the _source field for an index

PUT /api-logs?pretty
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}

Storing only selected fields in the _source field

PUT api-logs
{
  "mappings": {
    "_source": {
      "includes": [
        "*.count",
        "error_info.*"
      ],
      "excludes": [
        "error_info.traceback_message"
      ]
    }
  }
}

Including only selected fields using source filtering

GET api-logs/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": ["api_name", "status_code", "*id"]
  }
}
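
Excluding selected fields using source filtering

As a complementary sketch, the same search-time filter also accepts an excludes list. The request below reuses the api-logs index and the error_info.traceback_message field from the earlier examples, and assumes that field exists in the indexed documents.

GET api-logs/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "excludes": ["error_info.traceback_message"]
  }
}

Filtering at search time only changes what is returned in the response. The full document is still kept in _source on disk.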

Notes

The _source field adds storage overhead, but it serves several purposes:

  • It is returned as part of each hit in the response when a search query is executed.
  • It is required for reindex, update and update_by_query operations.
  • It is used for highlighting when a field is not stored, that is, when the field's mapping does not set "store": true.
  • It allows selecting which fields are returned (source filtering).
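
As a sketch of the "store": true mapping option mentioned in the list above, the request below creates an index with one stored text field. The index name api-logs-stored and the field name error_message are hypothetical and used only for illustration.

PUT api-logs-stored
{
  "mappings": {
    "properties": {
      "error_message": {
        "type": "text",
        "store": true
      }
    }
  }
}

A stored field can then be requested with the stored_fields option of a search request instead of being read from _source.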

The main cost of the _source field is the extra disk space it uses. This space can be reduced by changing the compression level to best_compression, which is configured through the index.codec setting.
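
As a minimal sketch, assuming the index is being created from scratch, the codec can be set in the index settings at creation time. index.codec is a static setting, so it cannot be changed on an open index. The index name api-logs-compressed is hypothetical.

PUT api-logs-compressed
{
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}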

