
What’s the most effective way to load test Rockset?

Load testing determines an application’s performance under varying levels of traffic, simulating real-world usage to ensure reliability, scalability, and stability. It matters because identifying bottlenecks and optimizing for heavy loads prevents crashes, downtime, and user frustration, which is critical when even minor hiccups can impact business reputation and revenue.

Load testing is a crucial step in ensuring the scalability and performance of any database solution, including those powered by Rockset. By conducting load testing, our goal is to assess the system’s behavior under both normal and peak conditions. This process helps evaluate important metrics such as queries per second (QPS), concurrency, and query latency. Understanding these metrics lets us size compute resources correctly, so that they are adequately provisioned to handle the projected workload. That, in turn, helps organizations meet Service Level Agreement (SLA) targets and provide a smooth, uninterrupted user experience, which is especially important in customer-facing applications where end-users expect a fast, responsive interface. Load testing is sometimes also referred to as performance or stress testing.

According to studies, 53% of website visitors will abandon a site if a webpage takes longer than three seconds to load.

Rockset compute resources, called Virtual Instances (VIs), come in a range of sizes from Small to 16XL, each with a predefined number of vCPUs and amount of memory. Choosing the right size depends on factors such as the complexity of your queries, the size of the underlying dataset, query selectivity, the number of concurrent queries, and your latency targets. If your VI also handles ingestion, you should account for the resources needed to ingest and index data concurrently with query execution. Luckily, there are two options that can help here:

  • Auto-scaling – with this feature, Rockset scales the VI up or down in real time, adjusting to changing workload demands. This flexibility is important when working with variable loads and when using a single VI for both ingestion and querying.
  • Query-only Virtual Instances – by creating VIs dedicated exclusively to processing queries, you ensure all of their resources go toward query execution. This makes it possible to isolate queries from ingestion, or to separate different applications onto distinct VIs, improving both scalability and efficiency.

We recommend conducting load testing in at least two scenarios: one where ingestion happens on the same VI that serves queries, and one where queries run against a separate, query-only VI. This helps you decide between a single-VI setup and a setup with dedicated query VIs.

Load testing lets us establish the performance limits of a given Virtual Instance for a specific application, and informs our choice of a VI size that can handle the desired workload.

Tools for load testing

There are several popular load testing tools: JMeter, k6, Gatling, and Locust. Each has distinct advantages and limitations:

  • JMeter: a versatile and user-friendly tool with a GUI, suitable for many types of load testing scenarios, though it can be resource-intensive.
  • k6: optimized for high performance and cloud environments, with JavaScript scripting; well suited to developers and CI/CD workflows.
  • Gatling: a high-performance tool written in Scala, suited for complex, sophisticated scripting scenarios.
  • Locust: written in Python, it is easy to pick up and quick to script with, making it a good choice for straightforward testing needs.

Each tool offers a different set of capabilities, and the right choice depends on the specific requirements of your load test. Whichever tool you use, read its documentation carefully and understand how it works and how it measures latency and response times, so that your results are reproducible and can be reliably communicated to your team or stakeholders.

Rockset provides a REST API for executing queries, so any of the tools above can be used to load test it via its REST endpoints.
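As a quick illustration, any tool or script that can issue an HTTP POST can run a query. Here’s a minimal Python sketch against the SQL query endpoint, using the same region URL we use later in this post; treat the exact payload shape as something to verify against the API reference:

import os
import requests

ROCKSET_APIKEY = os.getenv("ROCKSET_APIKEY")

# Execute an ad-hoc SQL query through Rockset's REST API.
# Payload shape per the query API; verify against the API reference.
resp = requests.post(
    "https://api.usw2a1.rockset.com/v1/orgs/self/queries",
    headers={
        "Authorization": f"ApiKey {ROCKSET_APIKEY}",
        "Content-Type": "application/json",
    },
    json={"sql": {"query": "SELECT 1"}},
)
resp.raise_for_status()
print(resp.json())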

In this blog I’ll focus on load testing Rockset with Locust, but I’ll also provide useful resources for JMeter, k6, and Gatling users at the end.

Let’s walk through the steps for setting up Rockset and Locust for load testing: defining the query we want to test, turning it into a Query Lambda, creating an API key, spinning up a virtual instance, and pointing Locust at it.

Let’s assume we have data ingested into Rockset and a typical SQL query our application runs. We recommend turning that query into a Query Lambda: a named, parameterized SQL query that Rockset exposes as a REST endpoint. Query Lambdas can be parameterized and version-controlled, which makes managing SQL queries easy and means you won’t need to update your load testing scripts whenever the SQL changes.

The first step is to define the query you want to load test.

In our example, we want to identify the top-selling product in our online store for a particular date. This is what the SQL query looks like; note that :date is a parameter we can supply when executing the query:

SELECT
    s.Date,
    MAX_BY(p.ProductName, s.Count) AS ProductName,
    MAX(s.Count) AS NumberOfClicks
FROM "Demo-Ecommerce".ProductStatsAlias s
INNER JOIN "Demo-Ecommerce".ProductsAlias p ON s.ProductID = CAST(p._id AS INT)
WHERE s.Date = :date
GROUP BY s.Date;

Next, save your query as a Query Lambda.
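You can do this in the Rockset console, or script it through the REST API. Here’s a minimal sketch of the latter, assuming a lambdas endpoint that accepts the query text plus its default parameters; double-check the payload shape against the API reference:

import os
import requests

ROCKSET_APIKEY = os.getenv("ROCKSET_APIKEY")

# Sketch: create a Query Lambda named LoadTestQueryLambda in the "sandbox"
# workspace. The payload shape is an assumption; check the API reference.
sql = """
SELECT s.Date, MAX_BY(p.ProductName, s.Count) AS ProductName, MAX(s.Count) AS NumberOfClicks
FROM "Demo-Ecommerce".ProductStatsAlias s
INNER JOIN "Demo-Ecommerce".ProductsAlias p ON s.ProductID = CAST(p._id AS INT)
WHERE s.Date = :date
GROUP BY s.Date
"""

resp = requests.post(
    "https://api.usw2a1.rockset.com/v1/orgs/self/ws/sandbox/lambdas",
    headers={
        "Authorization": f"ApiKey {ROCKSET_APIKEY}",
        "Content-Type": "application/json",
    },
    json={
        "name": "LoadTestQueryLambda",
        "sql": {
            "query": sql,
            "default_parameters": [
                {"name": "date", "type": "date", "value": "2023-11-24"}
            ],
        },
    },
)
resp.raise_for_status()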

We’ll call ours LoadTestQueryLambda, which can then be called as a REST endpoint:

curl --request POST \
  --url https://api.usw2a1.rockset.com/v1/orgs/self/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest \
  -H "Authorization: ApiKey $ROCKSET_APIKEY" \
  -H 'Content-Type: application/json' \
  -d '{
    "parameters": [
      {
        "name": "date",
        "type": "date",
        "value": "2023-11-24"
      }
    ],
    "virtual_instance_id": "<your virtual instance ID>"
  }' \
  | python -m json.tool

Next, create an API key. We’ll use it in our Locust script to authenticate against Rockset and execute queries. You can create an API key through the Rockset console or through the API. Be sure to save the key securely, as it will be used to authenticate your requests.
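For illustration, here’s a hedged sketch of what creating a key through the API could look like; the endpoint path and response shape are assumptions to verify against the API reference:

import requests

# Sketch: create a new API key using an existing key with admin rights.
# Endpoint path and response shape are assumptions; verify in the API reference.
ADMIN_APIKEY = "<an existing API key>"

resp = requests.post(
    "https://api.usw2a1.rockset.com/v1/orgs/self/users/self/apikeys",
    headers={
        "Authorization": f"ApiKey {ADMIN_APIKEY}",
        "Content-Type": "application/json",
    },
    json={"name": "load-test-key"},
)
resp.raise_for_status()
print(resp.json())  # the new key is in the response; save it securely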

Next, identify the virtual instance you want to load test and note its ID. In this walkthrough, we want to load test a Rockset virtual instance that is dedicated solely to query execution, so we spin up an additional Medium virtual instance for this purpose.

Once the VI is created, its ID is available in the console.

Next, we’ll install Locust. You can do this on your local machine or on a dedicated instance, such as an EC2 instance in AWS.

$ pip install locust

Now we’ll create the Python script for the Locust load test. The script expects the ROCKSET_APIKEY environment variable to be set to the API key we created earlier.

We can use the script below as a template:

import os
from locust import HttpUser, task, tag

class query_runner(HttpUser):
    ROCKSET_APIKEY = os.getenv('ROCKSET_APIKEY')  # API key is an environment variable

    def on_start(self):
        # Set up authentication headers once per simulated user
        self.headers = {
            "Authorization": "ApiKey " + self.ROCKSET_APIKEY,
            "Content-Type": "application/json"
        }
        self.client.headers = self.headers
        self.host = "https://api.usw2a1.rockset.com/v1/orgs/self"  # replace this with your region's URI
        self.client.base_url = self.host
        self.vi_id = '<your virtual instance ID>'  # replace this with your VI ID

    @tag('LoadTestQueryLambda')
    @task(1)
    def LoadTestQueryLambda(self):
        # using default query parameters for now
        data = {
            "virtual_instance_id": self.vi_id
        }
        target_service = "/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest"  # replace this with your Query Lambda path
        result = self.client.post(
            target_service,
            json=data
        )

Now we’re ready to run the load test. Once the ROCKSET_APIKEY environment variable is set, we can start Locust:

export ROCKSET_APIKEY=<your api key>
locust -f my_locust_load_test.py --host https://api.usw2a1.rockset.com/v1/orgs/self

Then navigate to http://localhost:8089, where we can start our Locust load test.

Here’s what happens once we hit the Start swarming button:

  1. Locust starts spawning virtual users, up to the designated number, at the specified spawn rate. These users are instances of the user class defined in the Locust script. In our case, we start with a single user, manually scale up to 5 and then 10 users, and then scale back down to 5 and 1.
  2. Each virtual user starts executing the tasks defined in the script. In Locust, tasks are typically HTTP requests, but they can also be arbitrary Python code. Tasks are picked at random, or according to their assigned weights. (We have only a single task, which queries LoadTestQueryLambda.)
  3. As virtual users execute their tasks, Locust collects performance metrics, including the number of requests, requests per second, response times, and the number of failures.
  4. Locust’s web interface updates in real time with these statistics, including the number of active users, the request rate, the failure rate, and response times.
  5. Locust keeps spawning users until it reaches the total number requested. The load increases incrementally, which lets you observe how system performance changes. In the graph below, you can see the number of users peaking at 5 and then 10 before trending down again.
  6. Virtual users wait a random interval between tasks, as specified by the wait_time in the script; this simulates more realistic user behavior. We didn’t apply this in our scenario, but Locust also has advanced features such as custom load shapes for building more tailored test scenarios; see the sketch after this list.
  7. The test keeps running until you stop it, or until it reaches a pre-set duration if you specified one.
  8. Throughout the test, Locust uses your machine’s resources to simulate users and issue requests; how much load it can generate depends on the resources of the machine it runs on.
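For reference, here’s a minimal sketch of what wait times and task weights could look like in a user class like ours. The 1-3 second pause and the second Query Lambda are purely illustrative, and the authentication setup from the earlier template is omitted for brevity:

from locust import HttpUser, task, between

class paced_query_runner(HttpUser):
    # Each simulated user pauses 1-3 seconds between tasks,
    # which approximates more realistic user behavior.
    wait_time = between(1, 3)

    @task(3)  # weight 3: picked three times as often as the task below
    def frequent_query(self):
        self.client.post("/ws/sandbox/lambdas/LoadTestQueryLambda/tags/latest",
                         json={"virtual_instance_id": "<your virtual instance ID>"})

    @task(1)  # weight 1: a hypothetical, less frequent second query
    def occasional_query(self):
        self.client.post("/ws/sandbox/lambdas/AnotherQueryLambda/tags/latest",
                         json={"virtual_instance_id": "<your virtual instance ID>"})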

Let’s now interpret the results we’re seeing.

Interpreting load testing results requires care: keep the test’s objectives, scope, and methodology in mind when reading the numbers.

Analyzing the results of a Locust run means understanding its key metrics and what they say about the performance of the system under test. The most important metrics Locust provides include:

  • Number of users: the number of simulated users at any point during the test. This metric helps you relate system performance to load: as the number of users increases, how does performance change?
  • Requests per second (RPS): the number of requests (queries) per second. Higher RPS means a heavier load; read it together with response times and error rates to judge how well the system handles concurrency and high traffic.
  • Response times: typically reported as average, median, and percentiles (e.g. 90th, 95th, and 99th). Look at the median together with the 90th and 99th percentiles: they show what most users experience, with only the worst 10% or 1% of requests seeing slower responses (see the sketch after this list).
  • Failure rate: the percentage of requests that failed. A high failure rate points to problems with the system under test, so it’s crucial to dig into the nature and root causes of these errors.
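To make the percentile figures concrete, here’s a small sketch of how median, p90, and p99 could be computed from a list of response times (the sample numbers are made up):

import numpy as np

# Hypothetical response times in milliseconds collected during a test run.
response_times_ms = [120, 150, 180, 210, 250, 260, 300, 450, 600, 900]

median = np.percentile(response_times_ms, 50)  # half of the requests were at or below this
p90 = np.percentile(response_times_ms, 90)     # 90% of requests were at or below this
p99 = np.percentile(response_times_ms, 99)     # all but the slowest 1% were at or below this

print(f"median={median:.0f}ms p90={p90:.0f}ms p99={p99:.0f}ms")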

Below we show the aggregated requests per second (RPS) and response times we measured under various user loads during our load testing, from a single user up to 10 concurrent users and back down again.

RPS peaked at approximately 20, while the median query latency stayed below 300 milliseconds and the P99 latency around 700 milliseconds.

We can then correlate these data points with the virtual instance metrics in Rockset. Below, you can see how the virtual instance handles the load in terms of CPU, memory, and query latency. There’s a clear correlation between the number of users in Locust and the spikes we see in the VI utilization graphs, and you can see query latency start to rise as more queries per second come in. CPU usage stays below 75%, memory utilization looks stable, and there is no significant queuing happening in Rockset.

Beyond viewing these metrics in the Rockset console or through the API, you can also analyze the actual SQL queries that were executed, including their individual runtimes, queue times, and more. To do that, we can query the query logs and calculate median run and queue times:

SELECT
    query_sql,
    COUNT(*) AS count,
    ARRAY_SORT(ARRAY_AGG(runtime_ms)) [(COUNT(*) + 1) / 2] AS median_runtime,
    ARRAY_SORT(ARRAY_AGG(queued_time_ms)) [(COUNT(*) + 1) / 2] AS median_queue_time
FROM
    commons."QueryLogs"
WHERE
    vi_id = '<your virtual instance ID>'
    AND _event_time > TIMESTAMP '2023-11-24 09:40:00'
GROUP BY
    query_sql

We would then repeat this load test against our main virtual instance, the one that also handles ingestion, to see how the system performs when it ingests data and executes queries at the same time. The process is the same; we would just use that VI’s identifier in our Locust script.

Conclusion

In summary, load testing is a vital part of ensuring the reliability and performance of any database solution, including Rockset. By selecting the right load testing tool and setting Rockset up appropriately, you can gain valuable insights into how your system will perform under different workloads.

Locust is easy to get up and running quickly, but because Rockset supports query and Query Lambda execution through its REST API, it is simple to integrate with any load testing tool.

Remember, the goal of load testing is not just to find the maximum load a system can withstand, but to understand how it behaves under various levels of stress and to ensure it meets the required performance criteria.

A few things to keep in mind when load testing Rockset:

  • Always load test before moving into production.
  • Use Query Lambdas to easily parameterize, version-control, and expose your queries as REST endpoints.
  • Load test on a dedicated, query-only virtual instance as well as on your main ingestion VI.
  • Use Rockset’s query logs to analyze statistics on the executed queries.
  • If you need better performance, there are several strategies to improve it; we’ll explore these in a forthcoming blog.

Have fun testing 💪

Helpful resources

Here are helpful resources for load testing with JMeter, Gatling, and k6. The process is analogous to what we did with Locust: get an API key, authenticate against Rockset, and then call the Query Lambda REST endpoint for a particular virtual instance.
