Tuesday, April 1, 2025

What are the key takeaways from running a massive 30TB time-series workload on Amazon OpenSearch Serverless? First, scalability: the serverless architecture let us absorb the enormous data load without provisioning or managing infrastructure, and integration with AWS services such as Amazon Kinesis Data Firehose made it efficient to ingest and process vast amounts of time-series data in near real time, enabling fast querying and analysis of our system’s performance and behavior. Second, reduced operational overhead: with no manual scaling, patching, or maintenance, our team was free to focus on higher-level work such as data visualization, analytics, and business insights. Third, flexibility: defining custom metrics and queries with OpenSearch’s Query DSL let us drill into specific aspects of system behavior and understand performance bottlenecks and optimization opportunities, while integration with Amazon QuickSight made it easy to build interactive dashboards and reports over the time-series data. Finally, cost-effectiveness: paying only for what we use kept a highly scalable, performant system affordable.

In today’s data-driven landscape, effectively managing and analyzing enormous volumes of data, including logs, is crucial for organizations seeking to extract valuable insights and make informed decisions. Even as data sets proliferate, drawing meaningful insights from them remains a significant challenge, driving organizations toward scalable solutions that avoid the complexities of infrastructure management.

OpenSearch Serverless removes the need to manually provision and scale infrastructure, allowing you to efficiently ingest, analyze, and visualize your time-series data while minimizing administrative burden and supporting data-driven decision making.

We recently launched a new capacity level of 30TB of time-series data per account per AWS Region. OpenSearch Serverless compute capacity for data ingestion and search is measured in OpenSearch Compute Units (OCUs), which are shared across collections that use the same AWS KMS key. To let accounts scale to larger datasets, OpenSearch Serverless now supports up to 500 OCUs per account per Region, allocated separately for indexing and search, more than doubling the previous limit of 200 OCUs. You can configure the OCU limits for search and indexing independently to keep costs under control, and you can monitor real-time OCU utilization through metrics to get a full picture of the resources your workload consumes. With support for datasets up to 30TB, organizations can gain deeper insight into their operations and make data-driven decisions to quickly identify and troubleshoot application outages, optimize system performance, and detect fraudulent activity.
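To make this concrete, here is a minimal Python sketch (using boto3) of how you might raise the account-level OCU limits and sample OCU utilization metrics. The Region, the capacity values, and the assumption that the account-level OCU metrics carry a ClientId dimension are ours; verify them against the service documentation before relying on this.

```python
import boto3
from datetime import datetime, timedelta, timezone

region = "us-east-1"  # example Region

# Raise the account-level OCU ceilings for indexing and search toward the
# new 500-OCU maximum described above (choose values that fit your
# workload and budget).
aoss = boto3.client("opensearchserverless", region_name=region)
aoss.update_account_settings(
    capacityLimits={
        "maxIndexingCapacityInOCU": 500,
        "maxSearchCapacityInOCU": 500,
    }
)

# Sample the search OCU metric that OpenSearch Serverless publishes to
# CloudWatch (namespace AWS/AOSS). We assume the metric is keyed by a
# ClientId dimension holding the account ID; check the dimension names
# in your Region.
account_id = boto3.client("sts").get_caller_identity()["Account"]
cloudwatch = boto3.client("cloudwatch", region_name=region)
now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/AOSS",
    MetricName="SearchOCU",
    Dimensions=[{"Name": "ClientId", "Value": account_id}],
    StartTime=now - timedelta(hours=6),
    EndTime=now,
    Period=300,
    Statistics=["Average", "Maximum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```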

This post explores how to ingest and analyze a time-series dataset of approximately 30TB using these OpenSearch Serverless capabilities.

Optimizations for faster data processing and query responses

Handling large datasets and running demanding analyses requires ample disk space, memory, and compute. In time-series collections, older, less frequently accessed shard data is kept on OCU disk; these are commonly referred to as warm shards. We introduced a new feature called warm shard recovery prefetch, which proactively fetches the data blocks a shard has recently served for queries. It prioritizes this work during shard movements such as shard balancing, vertical scaling, and deployments. By accelerating auto-scaling and making the system ready for diverse search workloads faster, this enhancement significantly improves overall performance. We share details of the improvements later in this post.

A select group of customers worked closely with us in a preview ahead of General Availability. In these trials, we observed a 66% improvement in warm query performance for certain customer workloads. We also optimized concurrency between coordinator and worker nodes, so that additional requests can be processed efficiently as OCUs scale out through auto-scaling; this yielded a further 10% improvement in query performance for hot and warm queries.

We also significantly hardened the stability of the system so it can manage time-series collections up to 30TB without compromising performance or accuracy, and we continue to refine the auto-scaling mechanism. The enhancements include optimized shard distribution for correct placement after rollovers, adaptive scaling policies based on queue length, and a dynamic sharding strategy that adjusts shard counts according to ingestion rate.


In the rest of this post, we document an internal proof of concept with a 30TB workload, covering the data we ingested and generated as well as our observations and results. Performance may vary depending on your specific workload.

Ingest the data

To generate a realistic load for testing, you can use the shared load generation scripts or your own application or data generator. You can run multiple instances of these scripts to produce a surge of indexing requests. For our test, we ran the load generation script against a single index, sending approximately 30TB of data over a 15-day period and retaining the data for 15 days with a data lifecycle policy.
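As a sketch of what such a load generator can look like, the following Python script uses the opensearch-py client with SigV4 signing (service name aoss) to bulk-index synthetic documents. The collection endpoint, index name, and field values are placeholders we chose to match the fields queried later in this post.

```python
import random
from datetime import datetime, timezone

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
from opensearchpy.helpers import bulk

region = "us-east-1"  # example Region
host = "your-collection-id.us-east-1.aoss.amazonaws.com"  # placeholder endpoint

# Sign requests with SigV4 for OpenSearch Serverless (service "aoss").
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, "aoss")
client = OpenSearch(
    hosts=[{"host": host, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    connection_class=RequestsHttpConnection,
)

def generate_docs(count, index_name="load-test-logs"):
    """Yield synthetic log documents. Time-series collections assign
    document IDs automatically, so no _id is set here."""
    services = ["auth", "billing", "search"]
    states = ["California", "Washington", "Oregon"]
    for _ in range(count):
        yield {
            "_index": index_name,
            "@timestamp": datetime.now(timezone.utc).isoformat(),
            "service": random.choice(services),
            "originState": random.choice(states),
            "latencyMs": random.randint(1, 500),
        }

# Run many copies of this loop in parallel to drive a surge of
# indexing requests.
while True:
    bulk(client, generate_docs(5_000))
```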

Test methodology

We configured the collection with redundancy enabled, so data is replicated across Availability Zones. With this configuration, roughly 12 to 24 hours of data is expected to remain in hot storage (OCU disk memory), with older data served from Amazon S3. Based on our expected ingestion rate and search traffic, we set a maximum of 500 OCUs each for indexing and search.
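For completeness, here is a minimal boto3 sketch of the kind of retention configuration described above; the policy name, collection name, and index pattern are placeholders.

```python
import json

import boto3

aoss = boto3.client("opensearchserverless", region_name="us-east-1")

# Retain indexes matching the pattern for 15 days; older data is
# deleted automatically. Names below are placeholders for this sketch.
aoss.create_lifecycle_policy(
    name="logs-retention-15d",
    type="retention",
    policy=json.dumps({
        "Rules": [
            {
                "ResourceType": "index",
                "Resource": ["index/time-series-collection/load-test-logs*"],
                "MinIndexRetention": "15d",
            }
        ]
    }),
)
```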

During the test, we tracked and visualized the auto-scaling behavior to understand its patterns. Indexing took approximately 8 hours to stabilize at around 80 OCUs.

Search took approximately two days to stabilize at around 80 OCUs.

Observations:

Ingestion

Ingestion throughput consistently exceeded 2TB per day.

Search

We ran queries over time ranges spanning 15 minutes up to 1 year. The first query was a cardinality aggregation on the service.keyword field:

{"aggs":{"1":{"cardinality":{"area":"service.key phrase"}}},"dimension":0,"question":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}]}}}

For instance

{"aggs":{"1":{"cardinality":{"area":"service.key phrase"}}},"dimension":0,"question":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}]}}}

Latency results for this query appear in the first table below.

The second query was a match query on originState:

{"question":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"ought to":[{"match":{"originState":"State"}}]}}}

For instance

{"question":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"ought to":[{"match":{"originState":"California"}}]}}}

The following tables summarize the latency of each query across time ranges. Each row lists the query and four latency percentiles in milliseconds, in ascending order.

Cardinality aggregation query:

| Time range | Query | Latency percentiles (ms, ascending) |
| --- | --- | --- |
| 15 minutes | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}]}}} | 325 / 403.867 / 441.917 / 514.75 |
| 1 hour | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1h","lte":"now"}}}]}}} | 1,061.66 / 1,397.27 / 1,482.75 / 1,719.53 |
| 4 hours | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-4h","lte":"now"}}}]}}} | 3,870.79 / 5,233.73 / 5,609.9 / 6,506.22 |
| 1 day | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}]}}} | 7,693.06 / 12,294 / 13,411.19 / 17,481.4 |
| 7 days | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-7d","lte":"now"}}}]}}} | 5,395.68 / 17,538.12 / 19,159.18 / 22,462.32 |
| 1 year | {"aggs":{"1":{"cardinality":{"field":"service.keyword"}}},"size":0,"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1y","lte":"now"}}}]}}} | 2,758.66 / 10,758 / 12,028 / 22,871.4 |

Match query:

| Time range | Query | Latency percentiles (ms, ascending) |
| --- | --- | --- |
| 15 minutes | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-15m","lte":"now"}}}],"should":[{"match":{"originState":"California"}}]}}} | 139 / 190 / 234.55 / 6,071.96 |
| 1 hour | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1h","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}} | 259.167 / 305.8 / 343.3 / 1,125.66 |
| 4 hours | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-4h","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}} | 462.933 / 653.6 / 725.3 / 1,583.37 |
| 1 day | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1d","lte":"now"}}}],"should":[{"match":{"originState":"California"}}]}}} | 678.917 / 1,366.63 / 2,423 / 7,893.56 |
| 7 days | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-7d","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}} | 1,353 / 2,745.1 / 4,338.8 / 9,496.36 |
| 1 year | {"query":{"bool":{"filter":[{"range":{"@timestamp":{"gte":"now-1y","lte":"now"}}}],"should":[{"match":{"originState":"Washington"}}]}}} | 2,166.33 / 2,469.7 / 4,804.9 / 9,440.11 |

Conclusion

This OpenSearch Serverless release not only expands the amount of data you can index compared to previous versions, but also introduces performance features such as warm shard recovery prefetch and concurrency optimization that significantly improve query response times. These enhancements reduce latency for warm queries while improving auto-scaling to handle a range of workloads seamlessly. We encourage you to try the 30TB capacity for your own workloads and migrate your data to benefit from the improved scalability, throughput, and performance.

To get started, refer to the Amazon OpenSearch Serverless documentation. For hands-on experience, work through the OpenSearch Serverless workshop, which provides detailed, step-by-step guidance on setting up and managing a collection.

Your thoughts are welcome in the comments below. If you have questions about this post, start a new thread on the forum.


About the authors

Serves as a senior product manager for Amazon OpenSearch Service, focusing on OpenSearch Serverless, with extensive experience in networking, security, and AI/ML. He holds a Bachelor's degree in Computer Science and an MBA in Entrepreneurship. In his spare time, he enjoys flying airplanes, hang gliding, and riding his bicycle.

Serves as an engineering leader for Amazon OpenSearch Service, specializing in building search experiences for OpenSearch customers. He has extensive experience designing highly scalable solutions in databases, real-time streaming, and distributed computing, with a background in domains such as the Internet of Things, fraud protection, gaming, and AI/ML. Outside of work, he enjoys cycling, hiking, and playing chess.

Is a senior software engineer at AWS leading the search and benchmarking areas of the Amazon OpenSearch Serverless project. He is passionate about finding solutions to complex problems in large-scale distributed systems. Outside of work, he enjoys woodworking, bike rides, watching basketball, and spending time with his family and dog.

Is a Sr. Solutions Architect who designs and implements search solutions using Amazon OpenSearch Service for customers around the world. He works closely with customers to migrate their workloads to the cloud and helps them optimize their cluster configurations for performance and cost savings. Before joining AWS, he helped many customers use OpenSearch and Elasticsearch for their search and log analytics needs. When not working, he can be found traveling and exploring new places, and he loves trying new foods.
