Cross-cluster search enables users to execute a query across multiple Elasticsearch or OpenSearch clusters. To perform cross-cluster search...
To install OpenSearch on EC2, you will need a Route 53 resource that connects to your VPC. The Terraform module will deploy 3 EBS, 5 EC2...
Helm is the best way to find, share, and use software built for Kubernetes. To install OpenSearch using Helm charts, you need to first...
OpenSearch supports migration from Elasticsearch using rolling upgrades. There are 4 methods to migrate data from Elasticsearch to OpenSearch...
If you’re using Elasticsearch version 7.11 or later, you cannot use rolling upgrades to migrate to OpenSearch. Instead, you can use these...
Installation instructions, advantages of using the Operator for OpenSearch management, tips and benefits of...
There are 3 types of OpenSearch alerts: system, logs, and business-specific. Before creating an alert, you need to set a channel. An example...
OpenSearch keeps the original JSON document in a field called _source. The source field serves special purposes such as...
The aggregations framework is a tool built in every OpenSearch deployment. The different aggregation types: Bucket, Metric & Pipeline...
An OpenSearch alias is a secondary name to refer to one or more indices. Aliases can be created and deleted dynamically using...
OpenSearch bulk makes it possible to perform many write operations in a single API call, which increases indexing speed. Using bulk API...
OpenSearch uses 3 types of caches to improve efficiency: the node query cache, the shard request cache, and the field data cache. To clear...
Circuit breaker exceptions are thrown to alert us that something needs to be fixed in OpenSearch in order to reduce memory usage. To fix...
OpenSearch has circuit breakers to deal with OutOfMemory errors that cause nodes to crash. To size a circuit breaker...
Official OpenSearch clients are available for Java, Perl, PHP, Python, Ruby and .NET. The official clients follow a similar structure and...
An OpenSearch cluster is a group of servers (nodes) working together to store data & respond to requests. The key elements of clustering...
This article will compare OpenSearch Dashboards and Kibana, highlighting their similarities, differences, and...
DELETE is an OpenSearch API that removes a document from a specific index. It requires the index name and the document _id.
OpenSearch delete by query is an API that deletes all documents matching a query. If you don't...
To find out which functions have been deprecated in OpenSearch, you can use the deprecation logs, the deprecation API, read breaking pages...
OpenSearch discovery occurs when a node starts, restarts or loses contact with the master node. The discovery.seed_hosts...
OpenSearch uses several parameters to enable it to manage hard disk storage across the cluster, such as...
The easiest way to start testing OpenSearch is to run the available Docker image. To spin up an OpenSearch cluster using Docker, you need...
Each OpenSearch document is a JSON structure, which is ultimately considered to be a series of key:value pairs. An example for creating...
In OpenSearch, the term fielddata is relevant when performing sorting and aggregations on text fields. To set fielddata=true, you...
An OpenSearch filter applies conditions inside the query to narrow down the matching results. A filter clause can be used in...
In OpenSearch, flush is the process of permanently storing data onto the disk for all of the operations that have been stored in memory.
The OpenSearch heap size is the amount of RAM allocated to the JVM of a node. When JVM performance is not optimal...
High CPU in OpenSearch is often a symptom of other underlying issues. It should be fixed since a distressed node will slow query response...
An OpenSearch index contains a schema and can have one or more shards and replicas. Here's how to create, delete, list, and query an index.
Indexing is the process of adding or updating new documents to an OpenSearch index. To index a document...
OpenSearch Lucene or Apache Lucene is an open-source Java library used as a search engine. OpenSearch is built on top of Lucene...
An OpenSearch mapping contains the properties of each field in the index. A common issue is an incorrectly defined mapping. To update...
If the maximum number of shards per node is exceeded in OpenSearch, shards can't be allocated. To fix this, check to see whether the limit is at...
OpenSearch metadata refers to additional information stored for each document using metadata fields. Metadata fields can be customized...
There are different types of OpenSearch nodes, each with its own role and purpose. Cluster manager, coordinating, and data node roles differ...
Dashboards are the most useful tool to visualize data without having to code an entire framework that consumes data from the engine...
In OpenSearch, Persistent refers to cluster settings that persist across cluster restarts. Examples of persistent settings include...
Plugins in OpenSearch are used to extend the functionality of OpenSearch. To install and uninstall an OpenSearch plugin...
Queues in OpenSearch exist in the context of Thread Pools. Queues are used to hold the pending requests for thread pools. For example...
Cluster rebalancing is the process by which an OpenSearch cluster distributes data across the nodes. To force rebalance manually...
In OpenSearch, recovery refers to the process of recovering an index or shard when something goes wrong. The recovery API can be used by...
OpenSearch red status indicates not only that the primary shard has been lost, but also that a replica has not been promoted...
OpenSearch requires a refresh operation to make indexed information available for search. You can set an OpenSearch refresh_interval by...
OpenSearch reindex is the concept of copying existing data from a source index to a destination index. The reindex API is...
OpenSearch replication refers to storing a redundant copy of the data. Replicas are used to provide high availability and failover of...
An OpenSearch repository needs to be registered using the _snapshot endpoint. The supported repository types are: S3, HDFS, Azure...
In OpenSearch, restore refers to a snapshot restore mechanism. To restore a snapshot, an index, or selected indices...
In OpenSearch, routing refers to document routing. OpenSearch will determine which shard the document will be routed to for indexing when...
The OpenSearch scroll API is useful when a search returns a large set of results. Large search results are taxing for the system...
To search in OpenSearch, send a GET request to the _search endpoint in the search API. In the query phase and the fetch phase there are...
An OpenSearch cluster can start to reject search requests for several reasons. To resolve this, check the state of the thread pool and...
OpenSearch settings can be configured on the cluster-level, node-level and index-level. Here's how to set up and optimize your settings.
Each OpenSearch shard is an Apache Lucene index. The number of shards is set when an index is created, and cannot be changed without...
It is a best practice that OpenSearch shard size should not go above 50GB for a single shard. If you go above this limit...
An OpenSearch snapshot is a backup of an index taken from a running cluster. It's better to use snapshots instead of disk backups due...
An OpenSearch task is equivalent to an operation. OpenSearch provides a dedicated task API for task management, which includes actions...
An OpenSearch template falls into one of these categories: index templates or search templates. To create a dynamic index template...
OpenSearch threadpools are used to manage how requests are processed and to optimize the use of resources. An example of...
An OpenSearch upgrade of an existing cluster can be done in 2 ways: through a rolling upgrade or a full cluster restart. To update...
A version corresponds to the OpenSearch built-in tracking system that tracks the changes in each document. By using _version...
An OpenSearch yellow status indicates that one or more of the replica shards on the cluster are not allocated to a node. This could occur...
An OpenSearch script can place heavy loads on clusters if it is not written carefully. It is a best practice to limit the type of...
Adaptive replica selection in OpenSearch is a process that prevents a distressed node from delaying the response to queries. To enable it...
OpenSearch cluster shard rebalancing and allocation are often confused with each other. If cluster shard rebalancing isn't enabled, then...
By default, OpenSearch expensive queries are allowed to run. By setting search.allow_expensive_queries to false, you can prevent users...
It's possible to reduce the risk of accidental deletion of indices by preventing OpenSearch wildcard use for destructive operations. To...
OpenSearch carries out "bootstrap checks" to ensure that important settings have been set correctly. Common issues with bootstrap checks...
OpenSearch can be configured to prevent memory swapping on its host machine by setting bootstrap.memory_lock to true. If bootstrap checks...
An OpenSearch read-only delete block can be applied automatically by the cluster because of a disk space issue. To resolve issues...
In this article, we will discuss some advanced tips and best practices for optimizing your OpenSearch Dashboards experience.
File descriptors are required to keep track of all the files OpenSearch has open at any given time, as well as all network...
Script regex is disabled in OpenSearch by default, but you can decide to enable it. Regex must be used with care in painless scripts...
If the ratio of memory to number of shards in the OpenSearch cluster is low, it suggests that you have insufficient memory compared to...
Finding the right number of shards for your OpenSearch indices, and the right size for each shard depends on many factors, including...
If you don’t have enough disk space available, OpenSearch will stop allocating shards to the node. To optimize your disk space...
When you have too many shards in your OpenSearch cluster, there are a few steps you can take in order to reduce the number of shards...
An OpenSearch cluster state includes metadata information about nodes, indices, etc. The main causes of having a large cluster state are...
Here are 12 tips to reduce and optimize your AWS OpenSearch costs. First, plan data retention: carefully adjust your...
There are various watermark thresholds on an OpenSearch cluster. As the disk fills up on a node, the 1st threshold to be crossed is...
When the “disk flood stage” threshold is exceeded on an OpenSearch cluster, it will start to block core actions. To resolve this issue...
High disk watermark is one of the various thresholds on your OpenSearch cluster. Passing this threshold is a warning and you should...
When an OpenSearch cluster state becomes too large it poses many challenges. To determine the size of your cluster state and reduce it...
Sometimes you can observe that the CPU and load on some of your OpenSearch data nodes is higher than on others. To fix this, check the...
Low disk watermark is one of the various thresholds on your OpenSearch cluster. Here are possible actions you can take to resolve...
The OpenSearch process is very memory intensive. Here are the memory requirements and some tips to reduce your OpenSearch memory usage.
Oversharding in OpenSearch indicates that you have too many shards, and thus they are too small. To prevent and resolve this issue...
Shard allocation is an algorithm by which OpenSearch decides which unallocated shards should go on which nodes. To resolve unbalanced...
UltraWarm is an AWS OpenSearch Service feature that provides a cost-effective way to store large amounts of time-based data. To use UltraWarm...
With cold storage, OpenSearch provides an advanced & efficient storage solution that complements the existing UltraWarm feature. To use...
Elasticsearch ILM (Index Lifecycle Management) & OpenSearch ISM (Index State Management) have the same goal, but their execution differs...
Follow these steps to configure all OpenSearch node role types (master, data, coordinating, ingest, machine learning, remote eligible...
Mappings are the core element of index creation in OpenSearch. Defining them correctly can improve performance. Mapping types include...
Ingest pipelines sit within the OpenSearch node and will perform a set of alterations on your data that you...
The join data type field allows users to establish parent-child relationships between documents in OpenSearch. To use join, you need to...
OpenSearch has many methods for defining relationships between documents, such as nested documents. An OpenSearch nested query...
OpenSearch object types can be used to define relationships between documents. Here's how to use the object field type for that purpose.
OpenSearch index templates allow us to create indices with user defined configuration. An index can pull the configuration from these...
Nested is a special object type that is indexed as a separate document. To demonstrate the use of OpenSearch nested fields vs. object...
Elasticsearch & OpenSearch offer ways to save costs by putting older data into cheaper machines. OpenSearch uses UltraWarm and...
This guide will go over the OpenSearch Cross Cluster Search (CCS) & Cross Cluster Replication (CCR) features, how to configure CCR and more.
In this article, we will delve into the different data types supported by OpenSearch and how to use them effectively.
The 2 methods in OpenSearch to calculate the storage size of specific fields in an index are: creating dedicated indices & using the Luke...
Terms aggregations rely on an internal data structure known as global ordinals. The eager_global_ordinals parameter is used to...
The new match_only_text feature in OpenSearch can save up to 10% of disk space on logging datasets. This field type will set a flat...
Remote-backed storage is an experimental OpenSearch feature. Here's how to enable it, recover data from remote repositories & its limitations.
OpenSearch searchable snapshots allow you to search snapshots in remote repositories without pre-downloading all index data to disk. To use...
The OpenSearch segment replication feature copies segments directly to the replica nodes' disks after refresh. The architecture design...
There are various methods for retrieving fields in OpenSearch, including: _source, stored_fields, fields & docvalue_fields. To retrieve...
By using the Split Index API in OpenSearch, an existing index can be split to create a new index with extra primary shards. To do this...
The text analysis process in OpenSearch performs two functions, tokenization and normalization, and is carried out by analyzers.
A tokenizer decides how OpenSearch will take a set of words and divide it into separated terms called “tokens”. To work with synonyms...
OpenSearch transforms allow users to generate new indices based on existing data aggregations. Here's how to create index transforms.
OpenSearch offers an easy way to configure a hot-warm architecture under specific conditions. To set up a hot-warm architecture for ISM...
Here are the similarities and differences between Elasticsearch Snapshot Lifecycle Management (SLM) and OpenSearch Snapshot Management (SM).
There are at least three use cases where you should consider using transforms instead of aggregations in OpenSearch. First, when the...
The node concurrent recoveries setting in OpenSearch determines the max number of shards that can be recovered at once from each node. To...
Coordinating nodes differ from ingest nodes. An ingest node is used for pre-processing documents in ingest pipelines. In contrast...
An OpenSearch coordinating node handles HTTP(S) requests for the cluster, especially indexing & search requests. A coordinating only...
When there is indexing downtime in OpenSearch, troubleshooting is needed. To resolve this incident you need to...
Setting up zone awareness for shard allocation in OpenSearch ensures high availability in case many servers go down. Here's how to...
Analyzing search slow logs in OpenSearch can provide users insights like the number of costly queries, reasons why queries were costly, so...
There are multiple ways to improve your OpenSearch aggregations performance. First, you should limit the scope by filtering documents...
In this guide, we will detail how to increase OpenSearch speed by optimizing query and OpenSearch settings.
One of the most difficult issues to manage and resolve in OpenSearch is poor search performance. Here's how to optimize search performance.
Here's an overview of the different methods to paginate documents in OpenSearch and how to paginate with Point in Time (PIT).
Follow these steps to list and restore dangling indices in OpenSearch: (1) Run the dangling indices API & copy the...
The OpenSearch async search API retrieves large amounts of data in a streaming fashion instead of in a single request. To limit the maximum response size...
There are many approaches for autocomplete in OpenSearch: index time, query time, completion suggester & search as you type. To choose...
OpenSearch provides 3 different techniques for fetching many results: Pagination, Search After & Scroll. The PIT API can extend pagination...
Rollup jobs in OpenSearch reduce old data storage costs by storing summaries of data for a given time period. Rollup examples include...
This guide explores how to reduce OpenSearch search latency based on a key study. OpenSearch latency can be...
OpenSearch offers three types of suggesters: term suggesters, phrase suggesters & completion suggesters (autocomplete). Suggesters work...
In this article, we will discuss the process of renaming an index in OpenSearch, the considerations to keep...
Here's how to generate reports in OpenSearch by using OpenSearch Dashboards and the CLI Reporting Feature. First, log into...
This guide explains the basics of embedding vectors and how vector search works under the hood in Elasticsearch & OpenSearch.
Anomaly detection is a feature in OpenSearch that captures unusual patterns in time series data. Here's how to set it up, with examples.
This guide will walk you through setting up vector search in OpenSearch using the k-NN plugin and the Neural Search plugin.
OpenSearch data streams enforce a setup that works well with time-based data, making the ISM policies easier to configure. To create...
Here's how to craft powerful OpenSearch hybrid search queries, including examples. The new hybrid search query and normalization-processor...
In OpenSearch, kNN stands for k-nearest neighbors & is used to find nearby documents based on vector dimensions. The kNN OpenSearch plugin...
In this guide, we'll discuss how to check the OpenSearch version, which is essential for ensuring compatibility. The version command line...
Cluster manager task throttling is an OpenSearch feature that allows users to mitigate the risk of task overflow. Here's how to use it.
Heavy merges in OpenSearch use CPU, memory and disk resources, which can slow down the cluster’s response speed. In order to fix...
OpenSearch cluster pending tasks are updates to the cluster state that were initiated by a user or the cluster. To resolve, list the...
When facing recurring red status events in OpenSearch, AutoOps can be used to debug the issue. Many pending tasks in the cluster were...
There are 2 methods to increase the primary shard count in OpenSearch: _reindex API & the _split API. Before using either method, you…
Here's how to configure an OpenSearch snapshot repository for Amazon S3, Azure Blob Storage & Google Cloud Storage (GCS). To register...
This guide delves into the details of implementing AWS OpenSearch backup. There are 2 types of backups: automated & manual snapshots. To...
Once an indexing queue exceeds the maximum size, the OpenSearch node will start rejecting index requests. To resolve this, check the...
An overloaded cluster manager in OpenSearch may cause instability in the cluster. There are 3 ways to fix this: (1) Checking for...
An OpenSearch node can disconnect from a cluster for several reasons, including: excessive garbage collection from JVM, configuration...
When you try to retrieve a document by ID, OpenSearch will count the number of times that it searches for an ID which doesn't exist...
The cluster concurrent rebalance setting in OpenSearch determines the max number of shards which the cluster can move to rebalance...
An OpenSearch cluster requires a cluster manager to be identified in the cluster. Reasons why a cluster-manager is not discovered yet...
OpenSearch document-level alerting detects activities at the moment a document is indexed. To use this feature, you first need to...
When OpenSearch detects that the merge process cannot keep up with the rate of indexing, then it will start to throttle indexing...
"Hotspots" in OpenSearch refer to a situation when some nodes are handling greater load than others. To resolve hotspots...
OpenSearch loaded client nodes could cause an increase in search or indexing response latency. To resolve...
One way to evaluate whether your OpenSearch resources are cost-efficient is to check the ratio of disk usage to the memory allocated...
In this article, we'll dive deep into the OpenSearch Performance Analyzer, discuss its architecture and share best practices for using it.
This guide discusses the advanced usage of OpenSearch Point in Time (PIT) & shares best practices for optimizing its performance.
By executing OpenSearch rolling restarts with the help of the API, you can maintain high cluster availability & avoid downtime. To do...
In this guide, we will discuss how to use the OpenSearch-Py library to perform bulk operations and provide tips for optimizing performance.
There are a number of reasons why a search request can be rejected by the OpenSearch cluster. 400 - rejected by OpenSearch can be...
There are a number of possible causes for slow searches on particular OpenSearch nodes. To fix the issue, you should...
Shard-level & search backpressure are OpenSearch features that seek to improve cluster performance by selectively rejecting requests when...
If the indexing queue is high or causes timeouts, it suggests that OpenSearch nodes can't keep up with the indexing rate. To fix slow indexing...
In OpenSearch, the combined_fields query allows you to search several text fields as though their values had been indexed into...
In this article, we will discuss how to optimize fuzzy search in OpenSearch to improve search performance and accuracy.
There are 4 types of OpenSearch boolean clauses: filter, must, should & must_not. A single bool query can contain a mix of them. To use...
The OpenSearch boosting query returns documents that match a positive query while reducing the score of documents that...
An OpenSearch composite aggregation allows you to paginate every bucket from a multi-level aggregation effectively. An example of...
In OpenSearch, the constant score query wraps other queries by executing them in a filter context. To implement constant_score query...
The OpenSearch exists query is used for returning the documents that have an indexed value for a specific field, which means it returns the...
The OpenSearch intervals query provides control over the words & their positions in a text that are required for a document to match a...
Match, Multi-Match & Match Phrase are all types of OpenSearch queries, used to search for matching documents in an index. To use them...
OpenSearch named queries allow you to label your queries with a name. Named queries can be utilized in a variety of use cases such as...
OpenSearch runtime fields with a type of lookup can retrieve field values from the associated indices. To implement runtime fields...
OpenSearch allows you to query data using 3 query languages: DSL, SQL & PPL. This guide covers how to prepare the data, use query tools...
There are several reasons for an OpenSearch slow query. Slow logs can be used to detect & troubleshoot slow query issues...
In this article, we will discuss how to optimize vector search in OpenSearch, a community-driven, open-source...
In OpenSearch, the Terms enum API looks for similarities in the index based on partial matches. To use the terms_enum API...
In this guide, we'll discuss the process of changing the admin password in OpenSearch. Before changing the admin password, ensure that...
By using Single Sign-On (SSO) in OpenSearch, users can log into many apps with the same credentials. To set up SSO using Azure AD as the IdP...
Single Sign-On (SSO) in OpenSearch allows users to have the same users & permissions across applications. To set up SAML SSO using Okta...
OpenSearch Keystore is a secure method for storing sensitive data. This guide explains how to use and manage the OpenSearch Keystore.
For security reasons, it's key to enable audit logs in OpenSearch. Here's how to configure audit logs & create a dashboard for visualization.
This article will discuss the default username and password for OpenSearch, how to change them, and how to secure your cluster by...
In OpenSearch, Active Directory (AD) via Lightweight Directory Access Protocol (LDAP) can be used for authentication. To configure it, use...
By setting up access control in OpenSearch, you can ensure that each user will be able to access what they need while securing other data...
To prepare an OpenSearch cluster for production, you need to first configure the certificates for security. The opensearch.yml file is used...
PKI (Public Key Infrastructure) is a set of actors & procedures to manage digital certificates. To set up PKI authentication in OpenSearch...
With Single Sign-On (SSO), users can log into many applications with the same credentials. To set up SSO using OpenID Connect (OIDC) in OpenSearch...
Users may encounter an issue where OpenSearch security is not initialized. Here are the causes of this issue and resolution methods.