Wednesday, April 2, 2025

Rockset turbocharges real-time personalization at Whatnot. By processing large volumes of data in milliseconds, Rockset lets Whatnot’s recommendation engine deliver tailored experiences that drive engagement and conversion.

Whatnot is a venture-backed e-commerce startup built for the streaming era.

We’ve built a live video marketplace for collectors, fashion enthusiasts, and superfans, where sellers host live auctions of their items on our video auction platform. Think eBay meets Twitch.

When we first went live in 2020, coveted collectibles were the main attraction on our streams. Today, buyers can shop live across more than 100 product categories, with sellers offering Pokémon and baseball trading cards, sneakers, vintage currency, and many other collectibles.

Crucial to Whatnot’s success is connecting buyers and sellers through our platform. The platform continuously collects real-time signals from our audience: the videos they watch, the comments and chat messages they post, and the products they buy. We use this data to rank the most popular and relevant videos, which we then present to users on the home screen of Whatnot’s mobile app or website.

However, to sustain and accelerate our growth, we wanted to take our home feed to the next level: ranking the suggestions we show each user by the most engaging and relevant content, in real time.

To do this well, we needed more capacity to process and analyze large amounts of data in real time. To support faster innovation, we also wanted a platform where data scientists and machine learning engineers could collaborate efficiently, deploy models quickly, and serve demanding real-time workloads with low latency and high concurrency.

The High Cost of Running Elasticsearch

On the surface, our legacy data pipeline seemed to be performing well, and it was built on a modern stack. Retrieval and ranking were performed using batch features loaded at ingestion. This process returned a single response in tens of milliseconds and handled concurrency of 50-100 queries per second.

However, we plan to grow usage 5-10x over the next 12 months, both by expanding into larger product categories and by making our recommendation engine smarter.

The bigger pain point for our small team was the operational overhead of running this pipeline. It was eroding productivity and severely limiting our ability to make our recommendation engine smart enough to keep pace with our growth.

Say we wanted to add a new user signal to our analytics pipeline. With our legacy serving infrastructure, the data first had to be sent through Confluent-hosted Kafka topics and ksqlDB, and then denormalized and/or rolled up. Next, a specific Elasticsearch index had to be manually adjusted or built for that data. Only then could we query the data.
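To make the denormalize-and-roll-up step concrete, here is a minimal Python sketch. It is an illustration only, not our actual Kafka or ksqlDB code: the field names (`user_id`, `stream_id`, `category`) and both helper functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def denormalize(events: List[dict], streams: Dict[str, dict]) -> List[dict]:
    """Attach the stream's category to every raw event (a lookup join)."""
    rows = []
    for event in events:
        stream = streams.get(event["stream_id"], {})
        # Events for streams we know nothing about fall back to "unknown".
        rows.append({**event, "category": stream.get("category", "unknown")})
    return rows


def roll_up(rows: List[dict]) -> Dict[Tuple[str, str], int]:
    """Count events per (user, category): the pre-aggregated signal."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for row in rows:
        counts[(row["user_id"], row["category"])] += 1
    return dict(counts)
```

Every new signal in a pipeline like this needs its own version of both steps before an index can be built on the result, which is where the overhead comes from.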

Maintaining our existing queries also took substantial effort. Our data changes shape constantly, so we were continually adding new signals to existing tables, and each addition required a time-consuming update to the relevant Elasticsearch index. After every index was created or updated, we had to manually test and update every other component of the data pipeline to make sure we had not introduced bottlenecks, data gaps, or errors.

Solving for Efficiency, Performance, and Scalability

Our new real-time analytics platform would be central to our growth, so we evaluated a range of options carefully.

We had built a data pipeline that used Airflow to pull data from Snowflake and push it into one of our operational databases, which served the Elasticsearch-backed application layer, optionally behind a caching layer. We could schedule this job to run at 5-, 10-, or 20-minute intervals, but the added latency meant we could not meet our SLAs, and the added technical complexity slowed our development velocity.

We evaluated a range of alternatives, including Rockset, Materialize, and Apache Pinot. Every one of these SQL-first platforms met our requirements, but we wanted a partner that could also take on the operational overhead.

Ultimately, we selected Rockset for its combination of features that support our growth: a fully managed, developer-friendly platform with real-time ingestion, fast queries, high concurrency, and automatic scaling.

Start with our top priority, developer velocity, which Rockset boosts in several ways. Rockset indexes every field, including nested ones, so queries are optimized automatically and run fast regardless of query type or data structure. We no longer need to spend time and labor building and maintaining indexes, as we did with Elasticsearch. Rockset also makes SQL a first-class citizen, which is a good fit for our data scientists and machine learning engineers. It supports the full menu of SQL commands, including four kinds of joins, searches, and aggregations. Such complex analytics were hard to perform with Elasticsearch.

With Rockset, our development workflow is much faster. We can join a new user signal or data source into our ranking engine without denormalizing it first. If the feature works as intended and performance is good, we can finalize it and move it to production within days. If latency is too high, we can then consider denormalizing the data or doing some precalculation up front.
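The join-first workflow can be sketched as follows. This is a minimal illustration rather than our production query: it uses Python's built-in `sqlite3` as a stand-in for a SQL engine, and the table names, column names, and scoring formula are all hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a stream table and a new signal table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE live_streams (
            stream_id TEXT PRIMARY KEY, category TEXT, viewer_count INTEGER);
        CREATE TABLE user_category_affinity (
            user_id TEXT, category TEXT, affinity REAL);
        """
    )
    conn.executemany(
        "INSERT INTO live_streams VALUES (?, ?, ?)",
        [("s1", "pokemon", 120), ("s2", "sneakers", 150), ("s3", "coins", 40)],
    )
    conn.executemany(
        "INSERT INTO user_category_affinity VALUES (?, ?, ?)",
        [("u1", "pokemon", 0.9), ("u1", "coins", 0.2)],
    )
    return conn


def rank_streams(conn: sqlite3.Connection, user_id: str, limit: int = 10):
    """Rank streams for one user by joining the new signal at query time."""
    query = """
        SELECT s.stream_id,
               s.viewer_count + 100.0 * COALESCE(a.affinity, 0.0) AS score
        FROM live_streams AS s
        LEFT JOIN user_category_affinity AS a
               ON a.category = s.category AND a.user_id = ?
        ORDER BY score DESC, s.stream_id
        LIMIT ?
    """
    return conn.execute(query, (user_id, limit)).fetchall()
```

The new signal lives in its own table and is only combined with the stream data when the query runs, so nothing has to be reshaped ahead of time to try the feature out. If the join proves too slow, the affinity score could be precomputed into the stream rows instead.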

Rockset’s fully managed software-as-a-service (SaaS) platform is mature and an early mover in its category. Take how Rockset decouples storage from compute: data can be stored and managed independently of the compute resources that process it. This gives Rockset fast, automatic scaling to handle our growing but spiky traffic, such as the sudden surge when a popular product or streamer comes online. Rockset’s mutable architecture also makes upserting data easy, and inserts, updates, and deletes are equally simple.
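As a rough mental model of what upserting into a mutable store means, here is a toy in-memory sketch in Python. It is not Rockset's API or implementation: the `MutableCollection` class and its methods are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Optional


class MutableCollection:
    """Toy in-memory collection keyed by document id."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def upsert(self, doc_id: str, fields: Dict[str, Any]) -> None:
        # Insert the document if it is new; otherwise merge the new
        # field values into the existing document in place.
        self._docs.setdefault(doc_id, {}).update(fields)

    def delete(self, doc_id: str) -> None:
        # Deleting a missing document is a no-op.
        self._docs.pop(doc_id, None)

    def get(self, doc_id: str) -> Optional[Dict[str, Any]]:
        doc = self._docs.get(doc_id)
        return dict(doc) if doc is not None else None
```

The point of the sketch is that a single write path covers both the insert and the update case, so the caller does not need to know whether a record already exists.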

On performance, Rockset delivered true real-time ingestion and complex queries with sub-50-millisecond end-to-end latency. It did not just match Elasticsearch: it did so with less operational effort and lower cost, while handling a much larger volume and variety of data and enabling more sophisticated analytics, all in SQL.

It is not just the Rockset product that has been great. The Rockset engineering team has been an exceptional partner. Whenever an issue came up, we messaged them on Slack and got an answer quickly. This is not a typical vendor relationship: they have become an extension of our team.

A Wide Range of Other Real-Time Uses

We are so happy with Rockset that we plan to expand its use into many other areas. Two slam dunks would be community trust and safety, such as monitoring comments and chat for offensive language, which Rockset is already able to support.

We also intend to use Rockset as a mini-OLAP (online analytical processing) database to deliver real-time reports and interactive dashboards to our sellers. Here Rockset would serve as a real-time alternative to Snowflake, and a more convenient and easier-to-use one. New data arriving through the Rockset API is upserted and indexed immediately, ready for querying.

We are also investing heavily in making Rockset our primary real-time store for machine learning. Rockset’s real-time capabilities make it a good fit for a machine learning pipeline, particularly for timely data such as a graph of the chat interactions over the past 20 minutes of a live stream. Data would flow from Kafka into Rockset and go through the same logical transformations as our batch dbt transformations in Snowflake. Ideally, we will one day abstract those transformations so they can be used in both Rockset and Snowflake dbt pipelines, for composability and repeatability. Our data scientists already know SQL, which Rockset supports well.
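To illustrate the kind of timely feature described above, here is a minimal Python sketch of a 20-minute sliding window over chat interactions. It is an assumption-laden illustration, not our pipeline: the `ChatInteractionWindow` class is hypothetical, an interaction is modeled as a (sender, recipient) pair, and events are assumed to arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque
from typing import Deque, Dict, Tuple

WINDOW_SECONDS = 20 * 60  # 20 minutes


class ChatInteractionWindow:
    """Edge counts of who interacted with whom inside a sliding time window."""

    def __init__(self, window_seconds: int = WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, str, str]] = deque()
        self._edges: Counter = Counter()

    def add(self, ts: float, sender: str, recipient: str) -> None:
        """Record one interaction at time ts (seconds)."""
        self._events.append((ts, sender, recipient))
        self._edges[(sender, recipient)] += 1
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop interactions that are window_seconds old or older.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, sender, recipient = self._events.popleft()
            self._edges[(sender, recipient)] -= 1
            if self._edges[(sender, recipient)] == 0:
                del self._edges[(sender, recipient)]

    def edges(self, now: float) -> Dict[Tuple[str, str], int]:
        """Return the weighted interaction graph as of time now."""
        self._evict(now)
        return dict(self._edges)
```

In a streaming setup, the same windowed aggregation would more likely be expressed as a SQL query filtered on event time, which is what would let the logic be shared with batch transformations.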

Rockset is in our sweet spot right now. Of course, in a perfect world that revolved around Whatnot, Rockset would add features tailored to our needs, such as stream processing, approximate nearest neighbors search, and auto-scaling. Real-time joins work well in many cases, but we still have scenarios where they are not enough and we have to precalculate. If we could get all of those features in a single platform, rather than assembling a heterogeneous stack, we would love it.

Want to learn more about how we build our home feed? Visit our website to explore current openings on our engineering team.
