Saturday, July 26, 2025

Amazon OpenSearch Service 101: How many shards do I need?

Customers new to Amazon OpenSearch Service often ask how many shards their indexes need. An index is a collection of shards, and an index's shard count can affect both indexing and search request efficiency. OpenSearch Service can ingest large amounts of data, split it into smaller units called shards, and distribute those shards across a dynamically changing set of instances.

In this post, we provide some practical guidance for determining the ideal shard count for your use case.

Shards overview

A search engine has two jobs: create an index from a set of documents, and search that index to compute the best-matching documents. If your index is small enough, a single partition on a single machine can store that index. For larger document sets, in cases where a single machine isn't large enough to hold the index, or where a single machine can't compute your search results effectively, the index can be split into partitions. These partitions are called shards in OpenSearch Service. Each document is routed to a shard that is determined, by default, by a hash of that document's ID.
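If you're curious how documents end up spread across shards, the _cat/shards API lists each shard of an index along with its document count and size on disk (replace the endpoint and index name placeholders with your own values):

curl -XGET "https://<domain-endpoint>/_cat/shards/<index-name>?v"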

A shard is both a unit of storage and a unit of computation. OpenSearch Service distributes shards across nodes in your cluster to parallelize index storage and processing. If you add more nodes to an OpenSearch Service domain, it automatically rebalances the shards by moving them between the nodes. The following figure illustrates this process.

Diagram showing how source documents are indexed and partitioned into shards.

As storage, primary shards are distinct from one another. The document set in one shard doesn't overlap the document set in other shards. This approach makes shards independent for storage.

As computational units, shards are also distinct from one another. Each shard is an instance of an Apache Lucene index that computes results on the documents it holds. Because all of the shards together comprise the index, they must work in concert to process every query and update request for that index. To process a query, OpenSearch Service routes the query to a data node holding a primary or replica shard. Each node computes its response locally, and the shard responses are aggregated into a final response. To process a write request (a document ingestion or an update to an existing document), OpenSearch Service routes the request to the appropriate shards, primary first and then replica. Because most writes are bulk requests, all shards of an index are typically used.
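You can see this fan-out in practice because a search response reports how many shards participated. For example, the following query (endpoint and index name placeholders are yours to fill in) returns a _shards object listing the total number of shards queried and how many responded successfully:

curl -XGET "https://<domain-endpoint>/<index-name>/_search?size=0" -H 'Content-Type: application/json' -d \
'{
    "query": { "match_all": {} }
}'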

The two different types of shards

There are two types of shards in OpenSearch Service: primary shards and replica shards. In an OpenSearch index configuration, the primary shard count serves to partition the data, and the replica count is the number of full copies of the primary shards. For example, if you configure your index with 5 primary shards and 1 replica, you will have a total of 10 shards: 5 primary shards and 5 replica shards.

The primary shard receives writes first. By default, the primary shard passes documents to the replica shards for indexing. OpenSearch Service's O-series instances use segment replication. By default, OpenSearch Service waits for acknowledgment from replica shards before confirming a successful write operation to the client. Primary and replica shards provide redundant data storage, improving cluster resilience against node failures. In the following example, the OpenSearch Service domain has three data nodes. There are two indexes, green (darker) and blue (lighter), each of which has three shards. The primary for each shard is outlined in purple. Each shard also has a single replica, shown with no outline.

Diagram showing how primary and replica shards are distributed across three OpenSearch instances.

OpenSearch Service maps shards to nodes based on a number of rules. The most basic rule is that primary and replica shards are never placed on the same node. If a data node fails, OpenSearch Service automatically creates another data node, re-replicates shards from surviving nodes, and redistributes them across the cluster. If primary shards fail, replica shards are promoted to primary to prevent data loss and provide continuous indexing and search operations.
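You can watch this recovery and rebalancing behavior through the cluster health API, which reports counts of active, relocating, initializing, and unassigned shards (endpoint placeholder is yours to fill in):

curl -XGET "https://<domain-endpoint>/_cluster/health?pretty"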

So how many shards? Focus on storage first

There are three types of workloads that OpenSearch users typically maintain: search for applications, log analytics, and vector databases. Search workloads are read-heavy and latency sensitive. They're usually tied to an application to improve search capability and performance. A common pattern is to index the data in relational databases to give users more filtering capabilities and provide efficient full-text search.

Log workloads are write-heavy and receive data continuously from applications and network devices. Typically, that data is put into a changing set of indexes, based on an indexing time period such as daily or monthly depending on the use case. Instead of indexing based on a time period, you can use rollover policies based on index size or document count to make sure shard sizing best practices are followed.

Vector database workloads use the OpenSearch Service k-Nearest Neighbor (k-NN) plugin to index vectors from an embedding pipeline. This enables semantic search, which measures relevance using the meaning of words rather than exactly matching the words. The embedding model from the pipeline maps multimodal data into a vector with potentially thousands of dimensions. OpenSearch Service searches across vectors to produce search results.
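As a minimal sketch of what such an index might look like, the following create index request enables the k-NN plugin and defines a vector field; the field name (embedding), dimension (768), and method settings are assumptions that depend on your embedding model and engine choice:

curl -XPUT https://<domain-endpoint>/<vector-index-name> -H 'Content-Type: application/json' -d \
'{
    "settings": {
        "index": {
            "knn": true,
            "number_of_shards": 3,
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,
                "method": {
                    "name": "hnsw",
                    "space_type": "l2",
                    "engine": "faiss"
                }
            }
        }
    }
}'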

To determine the optimal number of shards for your workload, start with your index storage requirements. Although storage requirements can vary widely, a general guideline is to use a 1:1.25 ratio from the source data size to estimate usage. Also, compression algorithms default to performance, but they can be adjusted to reduce size. When it comes to shard sizes, consider the following based on the workload:

  • Search – Divide your total storage requirement by 30 GB.
    • If search latency is high, use a smaller shard size (as small as 10 GB), increasing the shard count and parallelism for query processing.
    • Increasing the shard count reduces the amount of work at each shard (they have fewer documents to process), but also increases the amount of networking for distributing the query and gathering the response. To balance these competing concerns, examine your average hit count. If your hit count is high, use smaller shards. If your hit count is low, use larger shards.
  • Logs – Divide the storage requirement for your desired time period by 50 GB.
    • If using an ISM policy with rollover, consider setting the min_size parameter to 50 GB (see the example policy after this list).
    • Increasing the shard count for logs workloads similarly improves parallelism. However, most queries for logs workloads have a small hit count, so query processing is light. Logs workloads work well with larger shard sizes, but shard smaller if your query workload is heavier.
  • Vector – Divide your total storage requirement by 50 GB.
    • Decreasing shard size (as small as 10 GB) can improve search latency when your vector queries are hybrid with a heavy lexical component. Conversely, increasing shard size (as large as 75 GB) can improve latency when your queries are pure vector queries.
    • OpenSearch provides other optimization techniques for vector databases, including vector quantization and disk-based search.
    • k-NN queries behave like highly filtered search queries, with low hit counts. Therefore, larger shards tend to work well. Be prepared to shard smaller when your queries are heavier.
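As a rough worked example for a logs workload: 1 TB of source data per day at the 1:1.25 ratio is about 1.25 TB of index storage, and 1,250 GB divided by 50 GB suggests roughly 25 primary shards per daily index. If you roll over by size instead of by time period, the following ISM policy sketch applies the 50 GB rollover guidance from the preceding list; the policy name, state name, and index pattern are illustrative, and rollover also requires a write alias or data stream, which is omitted here:

curl -XPUT https://<domain-endpoint>/_plugins/_ism/policies/rollover-50gb -H 'Content-Type: application/json' -d \
'{
    "policy": {
        "description": "Roll over indexes near the 50 GB shard sizing guideline",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [
                    { "rollover": { "min_size": "50gb" } }
                ],
                "transitions": []
            }
        ],
        "ism_template": {
            "index_patterns": ["logs*"],
            "priority": 100
        }
    }
}'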

Don't be afraid of using a single shard

If your index contains less than the suggested shard size (30 GB for search and 50 GB otherwise), we recommend that you use a single primary shard. Although it's tempting to add more shards thinking it will improve performance, this approach can actually be counterproductive for smaller datasets because of the added networking. Each shard you add to an index distributes the processing of requests for that index across an additional node. Performance can decrease because there's overhead for distributed operations to split and combine results across nodes when a single node can do the work sufficiently.

Set the shard count

When you create an OpenSearch index, you set the primary and replica counts for that index. Because you can't dynamically change the primary shard count of an existing index, you have to make this important configuration decision before indexing your first document.

You set the shard count using the OpenSearch create index API. For example (provide your OpenSearch Service domain endpoint URL and index name):

curl -XPUT https://<domain-endpoint>/<index-name> -H 'Content-Type: application/json' -d \
'{
    "settings": {
        "index" : {
            "number_of_shards": 3,
            "number_of_replicas": 1
        }
    }
}'
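Although the primary shard count is fixed once the index exists, the replica count can be changed at any time with an update settings request, for example:

curl -XPUT https://<domain-endpoint>/<index-name>/_settings -H 'Content-Type: application/json' -d \
'{
    "index": {
        "number_of_replicas": 2
    }
}'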

If you have a single index workload, you only have to do this one time, when you create your index for the first time. If you have a rolling index workload, you create a new index on a regular basis. Use the index template API to automate applying settings to all new indexes whose name matches the template. The following example sets the shard count for any index whose name has the prefix logs (provide your OpenSearch Service domain endpoint URL and index template name):

curl -XPUT https://<domain-endpoint>/_index_template/<template-name> -H 'Content-Type: application/json' -d \
'{
    "index_patterns": ["logs*"],
    "template": {
        "settings": {
            "index" : {
                "number_of_shards": 3,
                "number_of_replicas": 1
            }
        }
    }
}'
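To verify that a new index picked up the template, you can check its settings after it's created; the index name here is a hypothetical example that matches the logs* pattern:

curl -XGET "https://<domain-endpoint>/logs-2025-07-26/_settings?pretty"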

Conclusion

This post outlined basic shard sizing best practices, but more factors can influence the ideal index configuration you choose to implement in your OpenSearch Service domain.

For more information about sharding, refer to Optimize OpenSearch index shard sizes or Shard strategy. Both resources can help you better fine-tune your OpenSearch Service domain to optimize its available compute resources.


About the authors

Tom Burns is a Senior Cloud Support Engineer at AWS based in the NYC area. He is a subject matter expert in Amazon OpenSearch Service and engages with customers for critical event troubleshooting and improving the supportability of the service. Outside of work, he enjoys playing with his cats, playing board games with friends, and playing competitive games online.

Ron Miller is a Solutions Architect based out of NYC, supporting transportation and logistics customers. Ron works closely with AWS's Data & Analytics specialist group to promote and support OpenSearch. On the weekend, Ron is a shade tree mechanic and trains to complete triathlons.
