Amazon OpenSearch Service securely unlocks real-time search, monitoring, and analysis of business and operational data for use cases such as application monitoring, log analytics, observability, and website search.

In this post, we explore the OR1 instance type, an OpenSearch-optimized instance.

The OR1 instance type offers a cost-effective way to handle large volumes of data. A domain running OR1 instances uses Amazon EBS volumes for primary storage, with data copied to Amazon S3 as it arrives. OR1 instances deliver significantly higher indexing throughput with high durability.
To learn more about OR1, refer to the Amazon OpenSearch Service documentation.
While actively writing to an index, we recommend keeping one replica. However, you can switch to zero replicas after a rollover, once the index is no longer being written to.

This is safe to do because the data is durably persisted in Amazon S3.

When a node fails and is replaced, your data is automatically restored from Amazon S3, but it is partially unavailable during the restore operation. For that reason, you shouldn't rely on this mechanism for use cases that require high availability, even when searches run only on indexes that are no longer actively written.
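The zero-replica switch after rollover can be done with a plain index settings update. A minimal sketch, assuming a rolled-over index named `logs-000001` (the index name is a placeholder):

```
PUT /logs-000001/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
```

The same setting can be applied automatically by an Index State Management policy, as discussed later in this post.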
Objective
In this blog post, we explore how OR1 improves the performance of OpenSearch workloads.

With segment replication, OR1 instances perform indexing only on the primary shards, removing the need to repeat the indexing work on replicas and making efficient use of CPU. Nodes can then use the spare compute to ingest more data, or use fewer indexing resources and preserve capacity for searches and other operations.

Let's run performance tests on an indexing-heavy workload and measure how each instance type performs.
Historically, r6g instances have been a performant choice for indexing-heavy workloads, relying on Amazon EBS storage. Im4gn instances feature local NVMe SSDs that provide high throughput and low latency for disk writes.

Let's compare OR1 indexing performance against these two instance types, focusing exclusively on indexing performance for this blog post.
Setup
To run the performance test, we set up a number of components, described in this section.

For the testing process, we initialize an index with the following mapping:
As you can see, we use a rollover configuration to keep the primary shard size under 50 GiB, following best practices.

We refined the mapping to avoid unnecessary indexing work and chose field types that prevent needless overhead.
The index policy used is as follows:
Our average document size is approximately 1.6 KiB, and the average number of documents in a single bulk request is around 4,000, for an uncompressed size of roughly 6.26 MiB per bulk.
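As a quick sanity check, the bulk size follows directly from the document size and document count above:

```python
# Back-of-the-envelope check of the bulk sizing figures above.
doc_size_kib = 1.6        # average document size in KiB
docs_per_bulk = 4_000     # average documents per bulk request

bulk_size_mib = docs_per_bulk * doc_size_kib / 1024
print(f"{bulk_size_mib:.2f} MiB per bulk")  # ≈ 6.25 MiB, matching the ~6.26 MiB above
```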
Testing protocol
The key parameters of the protocol are as follows:

- Number of data nodes: 6 or 12
- Job parallelism: 75 or 40
- Number of primary shards: 12, 48, or 96 (96 for 12 nodes)
- Replica count: 1 (two copies of the data)
- Instance types, each with 16 vCPUs:
- or1.4xlarge.search
- r6g.4xlarge.search
- im4gn.4xlarge.search
| Cluster | Instance type | vCPUs | Memory (GiB) | JVM heap (GiB) |
|---|---|---|---|---|
| or1-target | or1.4xlarge.search | 16 | 128 | 32 |
| im4gn-target | im4gn.4xlarge.search | 16 | 64 | 32 |
| r6g-target | r6g.4xlarge.search | 16 | 128 | 32 |
Note that the Im4gn cluster has roughly half the memory of the other clusters, but all three use the same JVM heap size of approximately 32 GiB.

Performance testing results

To test indexing performance, we started with 75 parallel jobs, each sending 3,000 bulks of 4,000 documents. We then varied the number of primary shards, data nodes, replicas, and job parallelism.
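The load-generator jobs described above are not shown in this post. As a hypothetical sketch of what each job does, here is how a single bulk request body can be assembled in the NDJSON format the `_bulk` API expects (the index name and helper are placeholders, not the actual test harness):

```python
import json

def make_bulk_body(docs, index_name):
    """Serialize documents into an OpenSearch _bulk request body (NDJSON):
    one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

body = make_bulk_body([{"msg": "hello"}, {"msg": "world"}], "logs-000001")
print(body.count("\n"))  # 4 lines: one action + one source per document
```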
Configuration: 6 data nodes, 12 primary shards, and 1 replica

In this configuration, we used 6 data nodes, 12 primary shards, and 1 replica, and observed the following results.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| im4gn-target | 89-97% | 34 min | 110 kdoc/s | 172 MiB/s |
| r6g-target | 88-95% | 34 min | 110 kdoc/s | 172 MiB/s |
Note that the r6g cluster runs at very high CPU, which triggers backpressure and causes some documents to be rejected.

The OR1 performs very well, with its CPU usage staying consistently below 80%.

Things to keep in mind:

- For production, make sure you implement retries with exponential backoff so that occasional rejections don't cause documents to be lost.
- Even when a bulk indexing operation returns 200 OK, the response can contain partial failures. You need to check the response body to verify that all documents were indexed successfully.
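The two points above can be sketched in a few lines. This is a minimal illustration, not a production client: it shows how to pull the failed items out of a `_bulk` response (which can return HTTP 200 with `"errors": true`) and how to generate exponential backoff delays for retrying them.

```python
import random

def failed_items(bulk_response: dict) -> list:
    """Return the per-item failures from an OpenSearch _bulk response.
    A _bulk call can succeed at the HTTP level while individual items fail,
    so each item's status must be inspected."""
    if not bulk_response.get("errors"):
        return []
    failed = []
    for item in bulk_response.get("items", []):
        # Each item is keyed by its operation type ("index", "create", ...).
        (_, result), = item.items()
        if result.get("status", 200) >= 300:
            failed.append(result)
    return failed

def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0):
    """Exponential backoff with jitter for retrying rejected bulks."""
    for attempt in range(retries):
        yield min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

# Example: a response where one item was rejected with HTTP 429.
response = {
    "errors": True,
    "items": [
        {"index": {"status": 201}},
        {"index": {"status": 429,
                   "error": {"type": "es_rejected_execution_exception"}}},
    ],
}
print(len(failed_items(response)))  # 1 failed item to retry
```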
When we reduce the number of parallel jobs from 75 to 40, with each client sending 750 bulks of 4,000 documents (120 million documents in total), we get the following results.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| or1-target | | 20 min | 100 kdoc/s | 156 MiB/s |
| im4gn-target | 75-93% | 19 min | 105 kdoc/s | 164 MiB/s |
| r6g-target | 77-90% | 20 min | 100 kdoc/s | 156 MiB/s |
Although throughput and CPU utilization dropped, CPU usage remained persistently high on Im4gn and R6g, while OR1 still had spare compute capacity.

Configuration: 6 data nodes, 48 primary shards, and 1 replica

With this configuration, we increased the number of primary shards from 12 to 48, which gives the indexing more parallelism.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| im4gn-target | 67-95% | 34 min | 110 kdoc/s | 172 MiB/s |
| r6g-target | 70-88% | 37 min | 101 kdoc/s | 158 MiB/s |
Indexing throughput improved significantly for the OR1, while the Im4gn and R6g did not improve because their CPU usage was still saturated.

Reducing the number of parallel jobs to 40 while keeping 48 primary shards, we see that OR1 is more stressed than with 12 primary shards (its minimum CPU rises), while the r6g CPU looks much healthier. For the Im4gn, however, CPU remains too high.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| im4gn-target | 80-94% | 18 min | 111 kdoc/s | 173 MiB/s |
| r6g-target | 70-80% | 21 min | 95 kdoc/s | 148 MiB/s |
Configuration: 12 data nodes, 96 primary shards, and 1 replica

For this configuration, we started from a fresh setup and added more compute, going from 6 data nodes to 12 and increasing the number of primary shards to 96.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| im4gn-target | 74-90% | 20 min | 187 kdoc/s | 293 MiB/s |
| r6g-target | 60-78% | 24 min | 156 kdoc/s | 244 MiB/s |
The OR1 and R6g perform well, with CPU usage staying below 80%; the OR1 delivers 33% more throughput while using 30% less CPU than the R6g.

The Im4gn stays at 90% CPU, but its throughput is still very good.

When we reduce the number of parallel jobs from 75 to 40, we get the following results.
| Cluster | CPU utilization | Duration | Indexing throughput | Data throughput |
|---|---|---|---|---|
| r6g-target | 60-77% | 12 min | 167 kdoc/s | 260 MiB/s |

With 40 parallel jobs, the OR1 and Im4gn perform on par, with the R6g close behind.
Interpretation
OR1 instances index faster because only the primary shards are written to; the replicas receive copied segments rather than redoing the indexing work. While delivering better throughput than the Im4gn and R6g instances, they also use less CPU, leaving headroom for additional load (such as search queries) or for a smaller cluster size.

We can compare an OR1 cluster of 6 nodes and 48 primary shards, indexing at approximately 178,000 documents per second, with a 12-node Im4gn cluster with 96 primary shards indexing around 187,000 documents per second, and a 12-node R6g cluster with the same number of primary shards indexing roughly 156,000 documents per second.

In other words, the OR1 performs on par with the larger Im4gn cluster, and better than the larger R6g cluster.
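Normalizing the figures above per node makes the comparison concrete: the 6-node OR1 cluster does roughly twice the per-node indexing work of the 12-node clusters.

```python
# Per-node indexing throughput derived from the figures above (kdoc/s).
clusters = {
    "or1 (6 nodes, 48 shards)":   (178, 6),
    "im4gn (12 nodes, 96 shards)": (187, 12),
    "r6g (12 nodes, 96 shards)":   (156, 12),
}
for name, (kdocs, nodes) in clusters.items():
    print(f"{name}: {kdocs / nodes:.1f} kdoc/s per node")
# or1 ≈ 29.7, im4gn ≈ 15.6, r6g ≈ 13.0 kdoc/s per node
```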
When sizing clusters with OR1 instances, keep the following observations in mind.
Our results show that OR1 instances ingest data at higher throughput. However, as the primary shard count increases, their performance can suffer from the limitations of the remote (Amazon S3) storage.

To get the best throughput from the OR1 instance type, use larger batch sizes, and use an Index State Management (ISM) policy that rolls over the index based on size, which limits the number of primary shards written at any one time. The OR1 instance type can then sustain more parallelism and more client connections.
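An ISM policy combining both recommendations from this post might look like the following sketch: roll over once a primary shard reaches 50 GiB, then drop replicas on the rolled-over index, relying on Amazon S3 for durability. The policy description and state names are illustrative; adapt the conditions to your workload.

```json
{
  "policy": {
    "description": "Roll over by primary shard size, then drop replicas",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "rollover": { "min_primary_shard_size": "50gb" } }
        ],
        "transitions": [
          { "state_name": "warm" }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "replica_count": { "number_of_replicas": 0 } }
        ],
        "transitions": []
      }
    ]
  }
}
```

As noted earlier, indexes with zero replicas are partially unavailable while a failed node restores from Amazon S3, so reserve this for indexes that can tolerate it.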
OR1 has little direct impact on search performance. However, CPU usage is lower on OR1 instances than on Im4gn and R6g instances, which leaves room for additional activity (such as searching and ingesting), or lets you reduce the instance size or count, potentially lowering cost.
Conclusion and recommendations for OR1

The OR1 instance type provides significantly better indexing performance than the other instance types, whether you run daily batch indexing jobs or sustain a continuously high indexing throughput.

The OR1 instance type comes at roughly a 30% price premium over existing instance types. Its benefit is smaller when you configure many replicas: CPU usage is mostly unaffected, but some scenarios can still see reduced indexing performance.
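Putting the price premium together with the measured throughput suggests the premium can pay for itself. A rough calculation, using the roughly 30% premium stated above and the 6-node OR1 versus 12-node r6g figures from the interpretation section (cluster-level list prices will vary by Region):

```python
# Rough price-performance comparison from the figures in this post.
or1_nodes, or1_kdocs = 6, 178    # 6-node OR1 cluster, 48 primary shards
r6g_nodes, r6g_kdocs = 12, 156   # 12-node r6g cluster, 96 primary shards
premium = 1.30                   # assumed ~30% per-instance price premium for OR1

relative_cost = (or1_nodes * premium) / r6g_nodes  # cluster cost ratio
relative_throughput = or1_kdocs / r6g_kdocs        # throughput ratio
cost_per_doc = relative_cost / relative_throughput
print(f"OR1 cost per indexed document ≈ {cost_per_doc:.2f}x the r6g cost")
```

Under these assumptions, the smaller OR1 cluster indexes each document for well under the r6g cluster's cost.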
Try the OR1 instance type for your next indexing-heavy workload.

He is a Principal Specialist Solutions Architect at AWS, helping customers design scalable solutions for real-time data and search workloads. In his spare time, he studies new languages and practices the violin.