How To Solve Issues Related to Log – Id . _source

Updated: Feb-20


Troubleshooting background

To troubleshoot the Elasticsearch log "Id . _source", it helps to understand the Elasticsearch concepts it relates to: aggregations and the _source field. Detailed explanations follow, with common problems, examples and useful tips.

Source in Elasticsearch

What it is

When a document is sent for indexing, Elasticsearch indexes its fields so they can be searched, and it also keeps the original JSON document in a special field called _source.
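
As a minimal sketch of this behavior, the console requests below index one document and read it back. The document ID 1 and the field values are hypothetical and chosen only for illustration, and the endpoints assume a 7.x-style typeless API with _source left enabled (the default).

```console
PUT api-logs/_doc/1
{
  "api_name": "login",
  "status_code": 200
}

GET api-logs/_doc/1
```

The response to the GET request should carry the original JSON document under the _source key.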

Examples

Disabling source field in the index

PUT /api-logs?pretty
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}

Store only selected fields as a part of _source field

PUT api-logs
{
  "mappings": {
    "_source": {
      "includes": [
        "*.count",
        "error_info.*"
      ],
      "excludes": [
        "error_info.traceback_message"
      ]
    }
  }
}

Including only selected fields using source filtering

GET api-logs/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": ["api_name", "status_code", "*id"]
  }
}

Notes

The _source field adds storage overhead, but it serves several purposes:

  • It is returned as part of the response when a search query is executed.
  • It is used for reindexing and for the update and update_by_query operations.
  • It is used for highlighting when a field is not stored, that is, when the field's mapping does not set store to true.
  • It allows selecting which fields are returned.
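
To illustrate the update and reindex points in the list above, here is a sketch of two requests that depend on _source being enabled. The document ID, the new status_code value, and the destination index name api-logs-v2 are hypothetical, and the endpoints assume a 7.x-style typeless API.

```console
POST api-logs/_update/1
{
  "doc": {
    "status_code": 500
  }
}

POST _reindex
{
  "source": { "index": "api-logs" },
  "dest": { "index": "api-logs-v2" }
}
```

Both operations read the stored _source of existing documents, so neither is supported on an index created with _source disabled.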

The main concern with the _source field is the extra disk usage. That space can be reduced by switching the compression level to best_compression, which is set through the index.codec index setting.
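
A minimal sketch of that setting, assuming a new index (the name api-logs-compressed is hypothetical). index.codec is a static setting, so it is normally supplied when the index is created, or changed while the index is closed:

```console
PUT api-logs-compressed
{
  "settings": {
    "index": {
      "codec": "best_compression"
    }
  }
}
```

The trade-off is typically slower stored-field reads in exchange for a smaller footprint on disk.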


Related Issues

To help troubleshoot related issues, we have gathered selected Q&A from the community and issues from GitHub. Please review the following for further information:

1. How To Include The Id Field In The

2. Official Document Question


Log Context

The log " -> id [{}]; _source [{}]" comes from tophits-aggregation.asciidoc.
We extracted the following from the Elasticsearch source code for in-depth context:

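// Assumed context, not part of the extracted snippet: `agg` is the terms
// aggregation read from the search response.
for (Terms.Bucket entry : agg.getBuckets()) {
    Object key = entry.getKey();          // bucket key
    long docCount = entry.getDocCount();  // number of documents in the bucket
    logger.info("key [{}]; doc_count [{}]", key, docCount);

    // We ask for top_hits for each bucket
    TopHits topHits = entry.getAggregations().get("top");
    for (SearchHit hit : topHits.getHits().getHits()) {
        // Log the ID and the original JSON (_source) of each top hit
        logger.info(" -> id [{}]; _source [{}]", hit.getId(), hit.getSourceAsString());
    }
}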
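
For context, the Java snippet reads the results of a bucket aggregation that has a top_hits sub-aggregation named top. A sketch of a search request with that shape follows; the outer aggregation name agg, the grouping field api_name (assumed to be mapped as keyword), the size values, and the _source filter are all assumptions made for illustration.

```console
GET api-logs/_search
{
  "size": 0,
  "aggs": {
    "agg": {
      "terms": { "field": "api_name" },
      "aggs": {
        "top": {
          "top_hits": {
            "size": 1,
            "_source": { "includes": ["status_code"] }
          }
        }
      }
    }
  }
}
```

Each hit returned inside top carries an _id and, subject to the filter, a _source; those are the two values the log line prints.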

