Amazon OpenSearch Service Overview: Expectations VS. Reality

Amazon OpenSearch Service vs unmanaged OpenSearch – a complete overview examining available features, capabilities, costs, and limitations.

Last Updated: May 2022

Running search operations can be extremely difficult and many companies struggle to maintain their deployments. To achieve the best results at the lowest possible cost, companies need to efficiently manage both the infrastructure and data layers of operations.

The infrastructure layer of operation includes: Elasticsearch/OpenSearch orchestration, deploying clusters, scaling resources, configuring security, provisioning and more.

The data layer includes: taking care of ingesting data, configuring and maintaining data structure, optimizing shards, avoiding latency, preventing incidents, improving performance and more.

Like many operations running in the cloud, when operating OpenSearch, you can choose to take the managed or unmanaged route. With an unmanaged approach, the customer takes full responsibility for everything, both the infrastructure and data layers (applications and services).

Alternatively, you can employ the Amazon OpenSearch Service, which manages most of the infrastructure tasks for you, thus facilitating the deployment, operation, and scaling of OpenSearch clusters in the AWS cloud.

In this blog post, we’ll explore the managed vs unmanaged route by examining available features, capabilities, costs, and limitations. We’ll also have a look at the expectations users have of the service, versus the reality they meet.

What does Amazon OpenSearch Service cover?

AWS offers an OpenSearch managed service to easily deploy and manage OpenSearch infrastructure.

AWS makes it clear that they do not operate or configure applications for their customers – the focus is on managing AWS infrastructure more efficiently and securely. While AWS can assist with security and some capacity optimization, they do not support the data layer.

Image from aws.amazon.com

Setup and configuration

Responsibilities of the unmanaged route

Those who choose to manage OpenSearch on their own are entirely responsible for setting things up from A to Z, including network infrastructure, OS, disks, JVM, orchestration tools, backup and restore procedures, and security. These tasks can be quite complex, requiring a great deal of time and expertise.

Setting up all the necessary tasks and services involves tinkering with many settings, such as memory limits and port mapping. If the settings aren’t tuned correctly, this could lead to performance issues. For instance, if you don’t set up cleanup procedures for your logs and regularly monitor your memory usage, the disk space may run out and cause an outage. Other settings, such as dynamically assigning a host port to multiple containers, can harm mission-critical parameters like high availability.
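To illustrate the kind of settings involved, here is a minimal sketch of a self-hosted, single-node OpenSearch container. The image tag, heap size, and port choices are assumptions to adapt to your own environment:

```yaml
# docker-compose sketch of a self-hosted, single-node OpenSearch deployment
services:
  opensearch:
    image: opensearchproject/opensearch:2.11.0   # assumed version tag
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true               # prevent JVM heap swapping
      - OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g       # explicit, equal heap min/max
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - "9200:9200"   # fixed host port; dynamic mapping can break client discovery
```

Note the explicit heap limits and the fixed port mapping, both of which address the pitfalls described above.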

Using the Amazon OpenSearch Service

One of the greatest advantages of Amazon OpenSearch Service is that it renders setup and configuration easy and immediate. You can easily deploy OpenSearch, selecting the desired number of instances, instance types, and storage options. Once selected, the service does the rest—setting up the domain, provisioning infrastructure capacity, and installing OpenSearch software.
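As a sketch of how compact that deployment step is, the parameters below are roughly what you would pass to the `create_domain` call of boto3's OpenSearch client. The domain name, instance types, and counts are illustrative assumptions, not recommendations:

```python
# Sketch of a domain definition for Amazon OpenSearch Service.
# All names and sizes here are illustrative assumptions.
domain_config = {
    "DomainName": "search-logs",              # hypothetical domain name
    "EngineVersion": "OpenSearch_2.5",
    "ClusterConfig": {
        "InstanceType": "r6g.large.search",
        "InstanceCount": 3,
        "DedicatedMasterEnabled": True,
        "DedicatedMasterType": "m6g.large.search",
        "DedicatedMasterCount": 3,            # three masters for stability
    },
    "EBSOptions": {
        "EBSEnabled": True,
        "VolumeType": "gp3",
        "VolumeSize": 100,                    # GiB per data node
    },
}

# With AWS credentials configured, the actual call would be:
# import boto3
# boto3.client("opensearch").create_domain(**domain_config)
```

Everything else – networking, OS, JVM, disks – is handled by the service.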

Once the cluster is up and running, Amazon OpenSearch Service fully manages resources and performance through automated administrative tasks, including hardware provisioning, automatic daily backups, cluster recovery after failure, and version upgrades. Managing resources is simple—with straightforward drop-down menus for adjusting instance size and other parameters. 

Amazon OpenSearch Service monitors, visualizes, and analyzes certain key metrics in real-time. However, alerts and events must be built from scratch or set up through CloudWatch.
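For instance, alerting on a domain running low on disk is something you define yourself. A minimal sketch of a CloudWatch alarm on the service's published `FreeStorageSpace` metric might look like this (the domain name, account ID, and threshold are placeholders):

```python
# Sketch of a CloudWatch alarm for low free storage on a managed domain.
# AWS/ES is the namespace the service publishes its metrics under.
alarm = {
    "AlarmName": "opensearch-low-disk",
    "Namespace": "AWS/ES",
    "MetricName": "FreeStorageSpace",
    "Dimensions": [
        {"Name": "DomainName", "Value": "search-logs"},   # hypothetical domain
        {"Name": "ClientId", "Value": "123456789012"},    # your AWS account ID
    ],
    "Statistic": "Minimum",
    "Period": 300,                  # evaluate every 5 minutes
    "EvaluationPeriods": 1,
    "Threshold": 20480.0,           # alarm below ~20 GB free (metric is in MB)
    "ComparisonOperator": "LessThanOrEqualToThreshold",
}

# With boto3: boto3.client("cloudwatch").put_metric_alarm(**alarm)
```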

Setup and configuration drawbacks 

Although this service integrates seamlessly with other AWS services, it only supports a limited set of plugins. Some of the missing plugins are vital for expanding OpenSearch capabilities.

Cost considerations

It’s tricky to compare the overall costs of an unmanaged and managed OpenSearch deployment. Cost is a function of many different factors, for example, the amount of data, backup needs, transfer costs, and the ability to plan ahead, as well as time and effort.

Managed costs

Although the Amazon OpenSearch Service does not require upfront fees or minimum usage requirements, its costs can become quite high. As operations begin to scale, the managed service can become particularly costly. At this stage, you may find yourself in a kind of “vendor lock-in,” making it difficult to transition back to an unmanaged environment. 

Moreover, while automated snapshots are stored in Amazon S3 for free, any additional manual snapshots will be stored and subject to Amazon S3 pricing. Yet another thing to take into account is that if you can plan ahead, choosing Reserved Instances in advance, as opposed to the on-demand option, can lower costs significantly.
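Taking manual snapshots on the managed service requires registering an S3 snapshot repository yourself. A sketch of the request body is below; the bucket name and IAM role ARN are placeholders, and on Amazon OpenSearch Service the registration request must be signed with AWS credentials:

```python
# Sketch of the body for registering a manual snapshot repository in S3.
# Bucket and role ARN are hypothetical placeholders.
repo_body = {
    "type": "s3",
    "settings": {
        "bucket": "my-snapshot-bucket",
        "region": "us-east-1",
        "role_arn": "arn:aws:iam::123456789012:role/SnapshotRole",
    },
}

# Sent as: PUT https://<domain-endpoint>/_snapshot/manual-snapshots
# (a signed request, e.g. via the requests-aws4auth package)
```

Snapshots stored this way are billed at standard Amazon S3 rates.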

Unmanaged expenses

At face value, the unmanaged option is far cheaper, costing roughly 30-40%—sometimes even up to 50%—less than the managed option. Without the OpenSearch Service, however, you’ll need to find a way to manage the infrastructure on your own, handling provisioning, monitoring, and finding adequate tools for observability and troubleshooting. If your team doesn’t have the resources to set things up correctly, optimize operations, and solve issues quickly, costs could pile up as well.

What about the data?

Another factor influencing costs is the amount of data. If you have less than 5 GB of data, the service may be quite expensive, especially if you have multiple small clusters for isolation purposes. AWS recommends three dedicated master nodes per cluster to ensure stability, so this fixed overhead applies to every cluster regardless of size. As a result, storing less than 5 GB of documents may not be worth the price. You get more bang for your buck if you have over 1 TB of data.
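The fixed master-node overhead can be sketched numerically. All prices below are illustrative assumptions, not AWS list prices:

```python
# Rough cost-per-GB comparison for a tiny vs. a ~1 TB cluster.
# Hourly prices are made-up round numbers for illustration only.
MASTER_NODE_HOURLY = 0.08   # assumed price of one dedicated master node
DATA_NODE_HOURLY = 0.17     # assumed price of one data node
HOURS_PER_MONTH = 730

def monthly_cost_per_gb(data_gb: float, data_nodes: int) -> float:
    """Monthly cluster cost divided by stored data; 3 dedicated masters assumed."""
    monthly = (3 * MASTER_NODE_HOURLY + data_nodes * DATA_NODE_HOURLY) * HOURS_PER_MONTH
    return monthly / data_gb

small = monthly_cost_per_gb(data_gb=5, data_nodes=2)      # 5 GB cluster
large = monthly_cost_per_gb(data_gb=1024, data_nodes=6)   # ~1 TB cluster

# The fixed master-node overhead dominates the small cluster, so its
# cost per GB ends up orders of magnitude higher than the large one.
```

Under these assumed prices, the 5 GB cluster pays dozens of times more per gigabyte than the 1 TB cluster.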

Monitoring, system optimization, and maintenance

Monitoring is a challenging and important issue in both the managed and unmanaged scenarios. Let’s take a look. 

On the managed side, first off, Amazon OpenSearch Service has limited access to administrative APIs, logs, and metrics. Although it provides aggregate metrics at the cluster level, it lacks some important node-level metrics and query logs. This is incredibly limiting when it comes to troubleshooting and pinpointing the root cause of any issues that arise.

When monitoring OpenSearch on your own, you may have more freedom and greater access to system metrics and logs, but you’ll still be tasked with finding a good monitoring tool and knowing how to make the best use of it. Either way, system visibility is critical for the next step—troubleshooting and performance optimization.

An important element of managing OpenSearch properly is the ability to adjust configurations in order to optimize performance. This points to a disadvantage in the AWS service, which only supports a limited set of operations and configuration changes. Among the functions missing are altering important performance factors (e.g., thread pool and query cache sizes) and basic functions like reindexing from a remote cluster (via reindex.remote.whitelist).
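For comparison, the fragment below sketches static settings a self-hosted operator can set in opensearch.yml but which the managed service does not expose. The values shown are examples, not recommendations:

```yaml
# opensearch.yml — static settings available when self-hosting,
# not adjustable on Amazon OpenSearch Service (example values)
thread_pool.write.queue_size: 1000
indices.queries.cache.size: 15%
reindex.remote.whitelist: "old-cluster.example.com:9200"
```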

Finally, the OpenSearch Service takes care of updates for you, allowing you to conveniently track progress without having to get involved. However, a significant drawback is AWS’s use of blue-green deployments to do so. In this method, all the data of the cluster is copied to the new nodes, after which the old cluster is destroyed. This process may take days, exceed the cluster’s capacity, or even cause it to crash mid-operation. As a result, upgrades, rollouts, maintenance, and even the tiniest update can become time-consuming and expensive in large deployments. 

When managing OpenSearch on your own, these operations are achieved through a rolling restart. This is more complex, but on the whole, takes much less time and is more efficient.
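The core of a rolling restart is toggling shard allocation around each node restart. A sketch of the `_cluster/settings` bodies involved, assuming a self-hosted cluster reachable over HTTP:

```python
# Settings bodies used around each node restart in a rolling restart.
# Before stopping a node, replica reallocation is disabled so shards
# aren't needlessly copied; it is reset once the node rejoins.
disable_replica_allocation = {
    "persistent": {"cluster.routing.allocation.enable": "primaries"}
}
reset_allocation = {
    "persistent": {"cluster.routing.allocation.enable": None}  # null restores default
}

# For each node in turn:
#   PUT /_cluster/settings with disable_replica_allocation
#   stop, upgrade, and start the node; wait for it to rejoin
#   PUT /_cluster/settings with reset_allocation; wait for cluster green
```

Unlike a blue-green deployment, no data is copied to a second cluster, which is why the process is typically faster.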

Mission-Critical operations: The case of downtime 

A built-in disadvantage of managed services is increased dependence on external support teams. When the production environment goes down, you’re completely dependent on AWS support, and it may take days until the issue is resolved.

Another problem with the OpenSearch Service that increases the risk of downtime is that it does not support shard rebalancing. Thus, if a single node runs out of space, the whole cluster will stop ingesting data. Businesses with mission-critical operations should take these limitations into account, and if they have decided to go with the managed option they might be better off purchasing premium AWS support.
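When self-hosting, by contrast, disk-based shard allocation can be tuned directly. A sketch of the relevant `_cluster/settings` body (the percentages shown are OpenSearch's usual defaults):

```python
# Disk watermark thresholds that govern when OpenSearch stops placing
# shards on a node (high) and blocks writes entirely (flood_stage).
watermarks = {
    "persistent": {
        "cluster.routing.allocation.disk.watermark.low": "85%",
        "cluster.routing.allocation.disk.watermark.high": "90%",
        "cluster.routing.allocation.disk.watermark.flood_stage": "95%",
    }
}

# Sent as: PUT /_cluster/settings on a self-hosted cluster
```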

It’s important to note that dealing with downtime without a managed service is no less challenging. And there are plenty of examples that demonstrate the difficulty of dealing with downtime on your own. 

Managed Services – Expectations VS. Reality

Users often start out self-managed before moving to managed services. The reality of managing Elasticsearch/OpenSearch is usually not what they expected.

Managed services structure their business in such a way that they do not take care of the application layer. They don’t provide proactive support, and they take no responsibility for the way you configure your data. This last part makes perfect sense, seeing as it would be out of scope for a managed service to be responsible for the data you want to put in your system. Their agenda is not to interfere with the way you decide to structure, configure and manage your data.

Self-Managed – Expectations VS. Reality

  • Expectation: Ensure that mission-critical applications are stable and running at peak performance. Reality: generic monitoring tools are not enough to operate successfully.
  • Expectation: Keep costs low by avoiding payments to outside providers for support and services. Reality: skilled DevOps engineers are hard to find, and they then need to develop and maintain internal expertise and tooling for multiple technologies; hardware and employee costs pile up.

Then again, users often have unrealistic expectations of what they’ll get from managed services as well.

Managed Services – Expectations VS. Reality

  • Expectation: Easy to start with and operate. Reality: stability and performance remain the responsibility of internal teams.
  • Expectation: Move ownership to the managed service and enjoy peak performance for the application. Reality: support is limited to infrastructure and doesn’t cover application needs.
  • Expectation: Cost efficiency. Reality: costs continually increase when teams try to solve performance problems by adding hardware.
  • Expectation: Features added by the managed service will add great value. Reality: most features go unused, as they are irrelevant to many use cases.

AWS OpenSearch support can be helpful and will sometimes assist with aspects of the application layer. However, this type of support is limited to “break-fix”, meaning they will only help you in a crisis when something isn’t working properly, as opposed to actively trying to optimize and improve.

The majority of managed services work with a ticket-based support system, which often entails long wait times and an inability to keep up with the dynamic demands of the system, not to mention reliance on the provider during production crises.

What does Opster offer?

Opster’s products and support services cover both the infrastructure layer and the data layer of managing Elasticsearch & OpenSearch, with a focus on the data layer that users have always been left alone to handle. Opster can be used either alone, while running self-hosted, or in addition to a managed service like AWS to cover the gaps in service and optimize performance.

  • Automatic incident resolution and prevention.
  • Support powered by the AutoOps platform.
  • Optimized and improved performance.
  • Resource utilization and reduction of infrastructure costs.

Opster’s AutoOps detects issues, provides automatic resolution paths and optimizes resource utilization. The technology and tools perform automatic optimizations with the AutoOps operator, an on-prem service, which also enables Opster’s support team to interact with the system in real-time. 

Summary

While the managed service unburdens users from many complex and time-consuming tasks, it is generally more expensive, has limitations, and is not a foolproof solution for operational issues or even failure. Amazon OpenSearch Service might be a good fit for you as a standalone service to run your search operation, especially if your deployment is not mission-critical. However, if that’s not the case, you’ll likely find that a significant part of running your search falls on you and can take a heavy toll on your search performance.

The managed service may be best suited to small or medium-sized operations that do not have mission-critical applications, or to teams without sufficient knowledge of OpenSearch to handle things on their own. However, if you have more than 1 TB of data, you may find the service lacks the flexibility needed to properly manage the cluster. Opting to manage everything on your own gives you greater control, but it also requires that you handle the inner workings of OpenSearch, including implementing observability and troubleshooting tools and being capable of watching out for pitfalls.

No matter how you’re hosting your Elasticsearch or OpenSearch, you can benefit from Opster’s complete solutions. With an Opster AutoOps subscription, your database administration will be taken care of from start to finish. You’ll benefit from complete resolution of issues in your infrastructure & data layers, end-to-end support and constant optimization of your clusters. To learn more about Opster AutoOps, click here.