At Rockset, we continually look for ways to give customers deeper insight into the product. Toward that goal, we recently decided to improve our customer-facing query logging. Our first implementation of query logs was built on a shared service called apiserver. After a query finished executing, apiserver would create a log entry that was then ingested into the _events collection. However, several issues made us rethink this approach:
- No isolation. Because query logs relied on shared services, heavy traffic from one organization could affect query logging in other organizations, leading to inconsistent log data.
- Incomplete logs. Because of the problems caused by shared services, we logged only query errors; successful queries were not logged. Recording data for every individual query also proved impractical, since it would have required significant resources and infrastructure.
- No way to debug query performance. The query logs in _events held only basic details about each query. Users had no way to learn why a query was slow or why it exhausted compute resources, because the logs lacked information about the query's execution plan.
Improved Query Logging
The new query logs feature addresses all of these issues.
The mechanisms that handle query logs run entirely within your Virtual Instance rather than in one of Rockset's shared services, which gives query logs the benefits of isolation. In addition, every query you submit is logged automatically once you have created a collection with a query logs source, provided you do not exceed a rate limit.
How Query Logs Work
Logging starts once query processing completes. As part of the steps that run on the final aggregator when a query finishes, a record containing metadata about that query is created. At this point, it may also be necessary to gather information from other aggregators that were involved in the query. The record is then held briefly in an in-memory buffer, and the contents of that buffer are flushed to Amazon S3 every few seconds. Once query logs have been written to S3, they are automatically ingested into any query log collections you have set up.
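The buffer-and-flush step described above can be pictured with a small sketch. This is not Rockset's implementation: the class name, the five-second interval, the newline-delimited JSON payload, and the `flush` callback that stands in for the S3 upload are all assumptions made for illustration.

```python
import json
import time
from typing import Callable, Dict, List


class QueryLogBuffer:
    """Holds query log records in memory and flushes them in batches."""

    def __init__(
        self,
        flush: Callable[[str], None],
        interval_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush            # hypothetical stand-in for the S3 upload
        self._interval_s = interval_s  # assumed value; the post only says "every few seconds"
        self._clock = clock
        self._records: List[Dict] = []
        self._last_flush = clock()

    def add(self, record: Dict) -> None:
        """Buffer one record, then flush if the interval has elapsed."""
        self._records.append(record)
        self.maybe_flush()

    def maybe_flush(self) -> bool:
        """Write buffered records as newline-delimited JSON when a flush is due."""
        now = self._clock()
        if not self._records or now - self._last_flush < self._interval_s:
            return False
        payload = "\n".join(json.dumps(r, sort_keys=True) for r in self._records)
        self._flush(payload)
        self._records.clear()
        self._last_flush = now
        return True
```

Passing the clock in as a parameter keeps the sketch easy to exercise without real waiting. A production buffer would more likely flush from a background timer than from `add`.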
INFO vs DEBUG Logs
When we first designed this project, we intended it to work hand in hand with the query profiler in the console, so that customers could use these logs to troubleshoot query performance. However, the profiler needs a large amount of data, and not every query log can carry all the information it requires. To work around this, we created two tiers of query logs: INFO and DEBUG.
INFO logs are created automatically for every query issued by your organization. They contain basic metadata about the query but cannot be used with the query profiler. When you know you may want to debug a query, you can specify a DEBUG log threshold with your query request. If the query's execution time exceeds that threshold, Rockset creates both an INFO log and a DEBUG log. There are two ways to specify a threshold:
- Use the debug_log_threshold_ms query hint:
  SELECT * FROM _events HINT(debug_log_threshold_ms=1000)
- Use the debug_threshold_ms parameter in API requests. This parameter can be set on each individual execution request.
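As a rough illustration of the two options above, the helpers below attach a threshold to a query either as a hint or as a request parameter. Only the names debug_log_threshold_ms and debug_threshold_ms come from this post. The helper functions are hypothetical, and the layout of the request body is an assumption that should be checked against the API reference before use.

```python
import json
from typing import Any, Dict


def with_debug_hint(sql: str, threshold_ms: int) -> str:
    """Append the debug_log_threshold_ms hint to a SQL statement.

    Naive by design: assumes the statement does not already carry a HINT clause.
    """
    if threshold_ms < 0:
        raise ValueError("threshold_ms must be non-negative")
    statement = sql.rstrip().rstrip(";")
    return f"{statement} HINT(debug_log_threshold_ms={threshold_ms})"


def build_request_body(sql: str, threshold_ms: int) -> Dict[str, Any]:
    """Build a JSON-serializable request body carrying debug_threshold_ms.

    The nesting of the SQL text under "sql" -> "query" is an assumed layout.
    """
    if threshold_ms < 0:
        raise ValueError("threshold_ms must be non-negative")
    return {"sql": {"query": sql}, "debug_threshold_ms": threshold_ms}


if __name__ == "__main__":
    print(with_debug_hint("SELECT * FROM _events;", 1000))
    print(json.dumps(build_request_body("SELECT * FROM _events", 1000)))
```

The hint travels with the SQL text itself, while the parameter keeps the SQL untouched and puts the threshold in the request, which may be more convenient when the same query is issued from several callers.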
Because DEBUG logs are much larger than INFO logs, the rate limit for DEBUG logs is much lower. For this reason, set a DEBUG log threshold only when you know the extra information could be useful. Otherwise, you risk hitting the rate limit at the moment you most need a DEBUG log.
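To see why a sparingly used threshold matters, consider a toy model of the two tiers. The budget numbers and the `TierBudget` class are invented for illustration, and the post does not describe how Rockset actually enforces its limits. The only behavior taken from the text is that an INFO log is written for every query and a DEBUG log is added when execution time exceeds the threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TierBudget:
    """Log entries still allowed in the current window (hypothetical model)."""

    remaining: int

    def take(self) -> bool:
        """Consume one slot if any are left."""
        if self.remaining <= 0:
            return False
        self.remaining -= 1
        return True


def logs_to_emit(
    elapsed_ms: int,
    threshold_ms: Optional[int],
    info: TierBudget,
    debug: TierBudget,
) -> List[str]:
    """Return the log levels written for one finished query."""
    levels: List[str] = []
    if info.take():
        levels.append("INFO")
    # A DEBUG log is only considered when a threshold was set and exceeded.
    wants_debug = threshold_ms is not None and elapsed_ms > threshold_ms
    if wants_debug and debug.take():
        levels.append("DEBUG")
    return levels
```

With a DEBUG budget of one, the first slow query that exceeds its threshold gets both logs, and the next one gets only an INFO log. Setting a threshold on every request would spend that smaller budget on queries you never intended to profile.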
System Sources
As part of this project, we decided to introduce the concept of system sources. These sources ingest data that originates from Rockset. However, unlike the _events collection, collections with system sources are managed entirely by your organization, which lets you configure all of the collection settings. Going forward, we will likely introduce additional system sources.
Getting Started with Query Logging
To start logging your queries, create a collection with a query logs source. This can be done through.
As you submit queries, Rockset will begin ingesting your query logs into this collection. Logs from the last 24 hours of queries will also be ingested into it. Note that it may take a few minutes after a query completes for its log to appear in your collection.
To use your query logs with the query profiler, run a query against your query logs collection in the query editor. The editor detects that you are querying such a collection and adds a 'Profiler' column to the query results table. Documents with a populated stats field will have a clickable link in this column. Clicking the link opens the query profile in a new tab.
Note that custom ingest transformations and query aliases can interfere with this feature, so we strongly advise against renaming any columns.
To learn more about using Rockset's Query Profiler, watch the available video.
Conclusion
Hopefully this has given you a quick look at what query logs can do for you. Whether you need to debug query performance or investigate why earlier queries failed, query logs in Rockset should make the job easier.