SLM in Elasticsearch vs Snapshot Management in OpenSearch

By Opster Expert Team - Gustavo

Updated: Jun 28, 2023

Introduction

Snapshot Lifecycle Management (SLM) in Elasticsearch and Snapshot Management (SM) in OpenSearch serve the same purpose: automating the creation and deletion of snapshots.

Both SLM and SM can create, schedule, and delete snapshots based on elapsed time or number of snapshots taken.

The table below outlines shared and unique features. The differences are explained below.

SLM vs. SM comparison table – similarities and differences

| Feature | Elasticsearch Snapshot Lifecycle Management (SLM) | OpenSearch Snapshot Management (SM) |
| --- | --- | --- |
| Installation | Native | Plugin |
| Scheduling | Yes (cron-like syntax) | Yes (cron-like syntax) |
| Time Limit | No | Yes |
| Snapshot Deletion | Yes (retention rules) | Yes (retention rules) |
| Notifications | No | Yes |
| Partial Snapshots | Yes | Yes |
| Ignore Unavailable Indices | Yes | Yes |
| Global State | Yes | Yes |
| Feature States | Yes | No |
| Policies Info | Stats API | Explain API |

Installation

Snapshot lifecycle management is native to Elasticsearch. In OpenSearch, the equivalent functionality is provided by the Snapshot Management plugin, which must be installed for the feature to be available.

Time limit

The time_limit parameter in OpenSearch sets the maximum duration allowed for a snapshot operation. If time_limit is longer than the scheduled interval between snapshots, the system skips any scheduled snapshots that fall inside that window, until time_limit has elapsed.

For instance, if the time_limit is set to 35 minutes and snapshots are scheduled to be taken every 30 minutes starting at midnight, the system will capture the snapshots at 00:00 and 01:00. However, the snapshot scheduled for 00:30 will be skipped due to the active time_limit.

This feature is not available in any version of Elasticsearch.
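
The skipping behavior in the example above can be sketched with a small simulation. This is a simplified model written for illustration, not the plugin's actual scheduling code: it assumes that once a snapshot starts, every later trigger is skipped until time_limit has elapsed. The function name `simulate_schedule` is hypothetical.

```python
from datetime import datetime, timedelta


def simulate_schedule(start, interval, time_limit, count):
    """Return (trigger_time, taken) pairs for `count` scheduled triggers.

    Simplified model (an assumption, not the plugin's code): once a
    snapshot starts, later triggers are skipped until `time_limit`
    has elapsed since that start.
    """
    results = []
    blocked_until = start  # nothing is blocked before the first trigger
    for i in range(count):
        trigger = start + i * interval
        taken = trigger >= blocked_until
        if taken:
            # A new snapshot starts, so the time_limit window restarts.
            blocked_until = trigger + time_limit
        results.append((trigger, taken))
    return results


if __name__ == "__main__":
    midnight = datetime(2023, 1, 1, 0, 0)
    for trigger, taken in simulate_schedule(
        midnight, timedelta(minutes=30), timedelta(minutes=35), 4
    ):
        print(trigger.strftime("%H:%M"), "taken" if taken else "skipped")
```

With a 30-minute interval and a 35-minute time_limit, the model marks 00:00 and 01:00 as taken and 00:30 as skipped, matching the example in the text.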

Notifications

OpenSearch can send notifications when any of the following events occur:

  • Snapshot created
  • Snapshot deleted
  • Creation or deletion failure
  • Snapshot operation takes more than the defined time_limit

This feature is not available in any version of Elasticsearch.
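
As a rough illustration, the four events listed above map onto condition flags in the notification part of an SM policy. The sketch below builds that fragment as a Python dictionary. It assumes the `notification` object layout of a `channel.id` plus a `conditions` map, which should be checked against the OpenSearch documentation for your version. The helper name and the channel ID are hypothetical.

```python
import json


def notification_block(channel_id, creation=True, deletion=False,
                       failure=True, time_limit_exceeded=True):
    """Build the `notification` fragment of an SM policy body.

    Assumption: the layout (channel.id plus a conditions map) follows
    the SM policy format; verify it against your OpenSearch version.
    """
    return {
        "channel": {"id": channel_id},
        "conditions": {
            "creation": creation,                        # snapshot created
            "deletion": deletion,                        # snapshot deleted
            "failure": failure,                          # creation or deletion failed
            "time_limit_exceeded": time_limit_exceeded,  # ran past time_limit
        },
    }


if __name__ == "__main__":
    # "my-channel-id" is a placeholder, not a real notification channel.
    print(json.dumps(notification_block("my-channel-id"), indent=2))
```

The resulting object would sit at the top level of the policy body, alongside the creation, deletion, and snapshot_config sections.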

Feature states

Feature states are a newer addition to Elasticsearch snapshots, and SLM policies can make use of them.

Feature states let users handle system indices in a granular way by taking or restoring snapshots for a subset of features. These features are a mix of built-in features and those defined by plugins.

This can be useful for security. For example, users may want a separate repository that contains only certain feature states, such as security, while keeping the cluster state out of the main repository.

Run the following command to retrieve cluster features:

GET /_features

Example Output:

{
  "features": [
    {
      "name": "security",
      "description": "Manages configuration for Security features, such as users and roles"
    },
    {
      "name": "logstash_management",
      "description": "Enables Logstash Central Management pipeline storage"
    },
    {
      "name": "geoip",
      "description": "Manages data related to GeoIP database downloader"
    },
    {
      "name": "async_search",
      "description": "Manages results of async searches"
    },
    {
      "name": "fleet",
      "description": "Manages configuration for Fleet"
    },
    {
      "name": "enrich",
      "description": "Manages data related to Enrich policies"
    },
    {
      "name": "searchable_snapshots",
      "description": "Manages caches and configuration for searchable snapshots"
    },
    {
      "name": "tasks",
      "description": "Manages task results"
    },
    {
      "name": "machine_learning",
      "description": "Provides anomaly detection and forecasting functionality"
    },
    {
      "name": "transform",
      "description": "Manages configuration and state for transforms"
    },
    {
      "name": "watcher",
      "description": "Manages Watch definitions and state"
    },
    {
      "name": "kibana",
      "description": "Manages Kibana configuration and reports"
    }
  ]
}

In Elasticsearch, users can include specific feature states by listing them in the policy, either through the API or through the Kibana interface:

PUT _slm/policy/my-snapshots
{
  "schedule": "0 50 2 * * ?",
  "name": "<my-snapshot-{now/d}>",
  "repository": "my_repo",
  "config": {
    "indices": "*",
    "include_global_state": true,
    "feature_states": [
      "kibana",
      "security"
    ]
  },
  "retention": {
    "expire_after": "7d",
    "min_count": 5,
    "max_count": 10
  }
}
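
Before sending a policy like the one above, it can help to check that every requested feature state actually exists on the cluster. The following is a minimal sketch that works on an already-parsed GET /_features response. The function name `build_snapshot_config` is hypothetical, and fetching the response is left out.

```python
def build_snapshot_config(features_response, wanted,
                          indices="*", include_global_state=True):
    """Return the `config` section of an SLM policy.

    Raises ValueError if a requested feature state is not listed in
    the parsed GET /_features response.
    """
    available = {feature["name"] for feature in features_response["features"]}
    missing = sorted(set(wanted) - available)
    if missing:
        raise ValueError(f"unknown feature states: {missing}")
    return {
        "indices": indices,
        "include_global_state": include_global_state,
        "feature_states": list(wanted),
    }


if __name__ == "__main__":
    # Trimmed copy of the example output shown earlier.
    features = {"features": [{"name": "security"}, {"name": "kibana"}]}
    print(build_snapshot_config(features, ["kibana", "security"]))
```

The returned dictionary corresponds to the "config" object in the PUT _slm/policy request body.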

Policies info

SLM and SM each have their own way of reporting the current status of a snapshot policy. Both are reviewed below.

Elasticsearch (Stats API)

The following command returns global statistics about SLM along with statistics for each policy:

GET /_slm/stats

Example output:

{
  "retention_runs": 1649,
  "retention_failed": 0,
  "retention_timed_out": 0,
  "retention_deletion_time": "3.7h",
  "retention_deletion_time_millis": 13439803,
  "total_snapshots_taken": 1650,
  "total_snapshots_failed": 1,
  "total_snapshots_deleted": 1550,
  "total_snapshot_deletion_failures": 0,
  "policy_stats": [
    {
      "policy": "my-snapshot-policy",
      "snapshots_taken": 1650,
      "snapshots_failed": 1,
      "snapshots_deleted": 1550,
      "snapshot_deletion_failures": 0
    }
  ]
}
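
A response like this is easy to post-process. The sketch below summarizes each policy from a parsed stats response. It estimates the number of snapshots still retained as taken minus deleted, which assumes nobody removed snapshots outside SLM. The function name is hypothetical.

```python
def summarize_policy_stats(stats):
    """Summarize the per-policy section of a parsed GET /_slm/stats response."""
    summary = {}
    for entry in stats["policy_stats"]:
        summary[entry["policy"]] = {
            # Estimate only: assumes no snapshots were deleted outside SLM.
            "retained_estimate": entry["snapshots_taken"] - entry["snapshots_deleted"],
            "has_failures": (
                entry["snapshots_failed"] > 0
                or entry["snapshot_deletion_failures"] > 0
            ),
        }
    return summary


if __name__ == "__main__":
    stats = {
        "policy_stats": [
            {
                "policy": "my-snapshot-policy",
                "snapshots_taken": 1650,
                "snapshots_failed": 1,
                "snapshots_deleted": 1550,
                "snapshot_deletion_failures": 0,
            }
        ]
    }
    print(summarize_policy_stats(stats))
```

For the example output above, the estimate is 100 retained snapshots, with the failure flag set because one snapshot failed.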

Get more information about a policy's executions by running:

GET _slm/policy/my-snapshot-policy?human

{
  "my-snapshot-policy": {
    "version": 1,
    "modified_date": "2023-03-08T18:23:43.418Z",
    "modified_date_millis": 1678299823418,
    "policy": {
      "name": "<my-snapshot-{now/d}>",
      "schedule": "0 */30 * * * ?",
      "repository": "snapshots",
      "config": {
        "partial": true
      },
      "retention": {
        "expire_after": "259200s",
        "min_count": 10,
        "max_count": 100
      }
    },
    "last_success": {
      "snapshot_name": "my-snapshot-2023.04.12-7b63taketbumqiajdr8weg",
      "start_time_string": "2023-04-12T03:29:59.810Z",
      "start_time": 1681270199810,
      "time_string": "2023-04-12T03:30:12.310Z",
      "time": 1681270212310
    },
    "last_failure": {
      "snapshot_name": "my-snapshot-2023.03.25-enngoweiqbqo1rlosc5_bg",
      "time_string": "2023-03-25T03:00:12.051Z",
      "time": 1679713212051,
      "details": """{"type":"snapshot_exception","reason":"[snapshots:my-snapshot-2023.03.25-enngoweiqbqo1rlosc5_bg] failed to create snapshot successfully, 9 out of 95 total shards failed"}"""
    },
    "next_execution": "2023-04-12T04:00:00.000Z",
    "next_execution_millis": 1681272000000,
    "stats": {
      "policy": "my-snapshot-policy",
      "snapshots_taken": 1650,
      "snapshots_failed": 1,
      "snapshots_deleted": 1550,
      "snapshot_deletion_failures": 0
    }
  }
}
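
The millisecond timestamps in this response make it straightforward to derive how long the last successful snapshot took. This is a minimal sketch over one parsed policy entry, with a hypothetical function name:

```python
def last_success_duration_seconds(policy_entry):
    """Return the duration of the last successful snapshot in seconds.

    `policy_entry` is the object stored under the policy name in a
    parsed GET _slm/policy/<name> response. Returns None if the policy
    has no recorded success yet.
    """
    last = policy_entry.get("last_success")
    if last is None:
        return None
    # Both fields are epoch milliseconds.
    return (last["time"] - last["start_time"]) / 1000


if __name__ == "__main__":
    entry = {
        "last_success": {
            "start_time": 1681270199810,
            "time": 1681270212310,
        }
    }
    print(last_success_duration_seconds(entry))  # 12.5
```

For the example output above, the last successful snapshot took 12.5 seconds.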

OpenSearch (Explain API)

OpenSearch exposes an explain API that focuses on the current state of the policy:

GET _plugins/_sm/policies/<policy_names>/_explain

Example output:

{
  "policies" : [
    {
      "name" : "daily-policy",
      "creation" : {
        "current_state" : "CREATION_START",
        "trigger" : {
          "time" : 1656403200000
        }
      },
      "deletion" : {
        "current_state" : "DELETION_START",
        "trigger" : {
          "time" : 1656403200000
        }
      },
      "policy_seq_no" : 44696,
      "policy_primary_term" : 19,
      "enabled" : true
    }
  ]
}
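
The trigger times in the explain response are epoch milliseconds, which are hard to read at a glance. The sketch below flattens a parsed explain response into one row per policy phase with a readable UTC timestamp. The function name is hypothetical.

```python
from datetime import datetime, timezone


def summarize_explain(response):
    """Return (policy, phase, state, next_trigger_utc) rows.

    `response` is a parsed SM explain response. Phases that are missing
    from a policy entry are skipped.
    """
    rows = []
    for policy in response["policies"]:
        for phase in ("creation", "deletion"):
            info = policy.get(phase)
            if not info:
                continue
            trigger_ms = info["trigger"]["time"]  # epoch milliseconds
            trigger = datetime.fromtimestamp(trigger_ms / 1000, tz=timezone.utc)
            rows.append(
                (policy["name"], phase, info["current_state"], trigger.isoformat())
            )
    return rows


if __name__ == "__main__":
    response = {
        "policies": [
            {
                "name": "daily-policy",
                "creation": {
                    "current_state": "CREATION_START",
                    "trigger": {"time": 1656403200000},
                },
            }
        ]
    }
    for row in summarize_explain(response):
        print(row)
```

For the example output above, the creation trigger of 1656403200000 corresponds to 2022-06-28T08:00:00 UTC.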

Get the full policy definition by running:

GET _plugins/_sm/policies/<policy_name>

Example output:

{
  "_id" : "daily-policy-sm-policy",
  "_version" : 6,
  "_seq_no" : 44696,
  "_primary_term" : 19,
  "sm_policy" : {
    "name" : "daily-policy",
    "description" : "Daily snapshot policy",
    "schema_version" : 15,
    "creation" : {
      "schedule" : {
        "cron" : {
          "expression" : "0 8 * * *",
          "timezone" : "UTC"
        }
      },
      "time_limit" : "1h"
    },
    "deletion" : {
      "schedule" : {
        "cron" : {
          "expression" : "0 1 * * *",
          "timezone" : "America/Los_Angeles"
        }
      },
      "condition" : {
        "max_age" : "7d",
        "min_count" : 7,
        "max_count" : 21
      },
      "time_limit" : "1h"
    },
    "snapshot_config" : {
      "metadata" : {
        "any_key" : "any_value"
      },
      "ignore_unavailable" : "true",
      "include_global_state" : "false",
      "date_format" : "yyyy-MM-dd-HH:mm",
      "repository" : "s3-repo",
      "partial" : "true"
    },
    "schedule" : {
      "interval" : {
        "start_time" : 1656341042874,
        "period" : 1,
        "unit" : "Minutes"
      }
    },
    "enabled" : true,
    "last_updated_time" : 1656341042874,
    "enabled_time" : 1656341042874
  }
}
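
The retention settings in the SLM and SM policy examples carry similar information under different names: expire_after in SLM corresponds roughly to max_age in SM, while min_count and max_count appear in both. Below is a minimal sketch of translating one into the other, using a hypothetical helper. It assumes the time value, such as "7d", is accepted unchanged by both systems. The cron expression is passed in separately, because the two examples use different cron formats (six fields in the SLM example, five in the SM example) and this sketch makes no attempt to convert between them.

```python
def slm_retention_to_sm_deletion(retention, cron_expression,
                                 timezone_name="UTC", time_limit=None):
    """Map an SLM `retention` object to an SM `deletion` object.

    Assumption: time values such as "7d" are valid in both systems.
    The cron expression is supplied by the caller and not converted.
    """
    condition = {}
    if "expire_after" in retention:
        condition["max_age"] = retention["expire_after"]
    for key in ("min_count", "max_count"):
        if key in retention:
            condition[key] = retention[key]

    deletion = {
        "schedule": {
            "cron": {"expression": cron_expression, "timezone": timezone_name}
        },
        "condition": condition,
    }
    if time_limit is not None:
        deletion["time_limit"] = time_limit
    return deletion


if __name__ == "__main__":
    slm_retention = {"expire_after": "7d", "min_count": 5, "max_count": 10}
    print(slm_retention_to_sm_deletion(slm_retention, "0 1 * * *", time_limit="1h"))
```

The output mirrors the shape of the "deletion" object in the SM policy shown above.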

Conclusion

Snapshot Lifecycle Management (SLM) in Elasticsearch and Snapshot Management (SM) in OpenSearch serve the same primary purpose: automating the creation and deletion of snapshots. They share several features, such as scheduling, snapshot deletion, global state, and ignoring unavailable indices. However, there are key differences between the two.

OpenSearch’s Snapshot Management offers a time limit feature, ensuring the snapshot process does not exceed a specified duration. This can help avoid conflicts with subsequent snapshots in cases where the snapshot process takes longer than the scheduled interval. 

Additionally, OpenSearch sends notifications for various snapshot events, such as creation, deletion, and failures, enhancing the user experience and increasing system status awareness.

On the other hand, Elasticsearch’s Snapshot Lifecycle Management offers feature states, which give more granular control over system indices when taking or restoring snapshots. This lets users manage specific features and their associated data in a more targeted way than the global state option allows.

Both SLM and SM can report on the status of a policy. OpenSearch focuses on the policy's current state, while Elasticsearch focuses on the snapshots the policy has generated.
