Tuesday, October 7, 2025

Bridging knowledge silos: cross-bounded context querying with Vanguard’s Operational Learn-only Information Retailer (ORDS) utilizing Amazon Redshift

Are you modernizing your legacy batch processing methods? At Vanguard, we confronted vital challenges with our legacy mainframe system that restricted our means to ship fashionable, customized buyer experiences. Our centralized database structure created efficiency bottlenecks and made it troublesome to scale companies independently for our hundreds of thousands of non-public and institutional traders.

On this submit, we present you the way we modernized our knowledge structure utilizing Amazon Redshift as our Operational Learn-only Information Retailer (ORDS). You’ll learn the way we transitioned to a cloud-native, domain-driven structure whereas preserving important batch processing capabilities. We present you the way this resolution enabled us to create logically remoted knowledge domains whereas sustaining cross-domain analytics capabilities—all whereas adhering to the ideas of bounded contexts and distributed knowledge possession.

Background and challenges

As monetary wants proceed to evolve, Vanguard is dedicated to delivering adaptable, top-notch experiences that foster long-lasting buyer relationships. This dedication spans from enhancing the non-public investor journey to bringing customized cell dashboards and connecting institutional purchasers with superior recommendation choices.

To raise buyer expertise and drive digital transformation, Vanguard has embraced domain-driven design ideas. This strategy focuses on creating autonomous groups, fostering quicker innovation, and constructing knowledge mesh structure. Central to this transformation is the Private Investor workforce’s mainframe modernization effort, transitioning from a legacy system to a cloud-based, distributed knowledge structure organized round bounded contexts – distinct enterprise domains that handle their very own knowledge. As a part of this shift, every microservice now manages its personal native knowledge retailer utilizing Amazon Aurora PostgreSQL-Suitable Version or Amazon DynamoDB. This strategy permits domain-level knowledge possession and operational autonomy.

Vanguard’s present mainframe system, constructed on a centralized Db2 database, permits cross-domain knowledge entry and integration but in addition introduces a number of architectural challenges. Although batch processes can be part of knowledge throughout a number of bounded contexts utilizing SQL joins and database operations to combine info from varied sources, this tight coupling creates vital dangers and operational points.

Challenges with the centralized database strategy embody:

  • Useful resource Competition: Processes from one area can negatively influence different domains attributable to shared compute assets, resulting in efficiency degradation throughout the system.
  • Lack of Area Isolation: Adjustments in a single bounded context can have unintended ripple results throughout different domains, growing the chance of system-wide failures.
  • Scalability Constraints: The centralized structure creates bottlenecks as load will increase, making it troublesome to scale particular person parts independently.
  • Excessive Coupling: Tight integration between domains makes it difficult to switch or improve particular person parts with out affecting your complete system.
  • Restricted Fault Tolerance: Points in a single area can cascade throughout your complete system attributable to shared infrastructure and knowledge dependencies.

To handle these architectural challenges, we selected to make use of Amazon Redshift as our Operational Learn-only Information Retailer (ORDS). The Amazon Redshift structure has compute and storage separation, which permits us to create multi-cluster architectures with a separate endpoint for every area with impartial scaling of compute and storage assets. Our resolution leverages the information sharing capabilities of Amazon Redshift to create logically remoted knowledge domains whereas sustaining the power to carry out cross-domain analytics when wanted.

Key advantages of the Amazon Redshift resolution embody:

  1. Useful resource Isolation: Every area could be assigned devoted Amazon Redshift compute assets, ensuring one area’s workload doesn’t influence others.
  2. Impartial Scaling: Domains can scale their compute assets independently based mostly on their particular wants.
  3. Managed Information Sharing: Amazon Redshift’s knowledge sharing characteristic permits safe and managed cross-domain knowledge entry with out tight coupling, sustaining clear area boundaries.

Let’s discover the completely different options we evaluated earlier than choosing ORDS with Amazon Redshift as our optimum strategy.

Options explored

We applied ORDS as our optimum resolution after conducting a complete analysis of obtainable choices. This part outlines our decision-making course of and examines the options we thought-about throughout our evaluation.

Operational Learn-only Information Retailer (ORDS):

In our analysis, we discovered that utilizing Amazon Redshift for ORDS offers a strong resolution for dealing with knowledge throughout completely different enterprise areas. It excels at managing giant volumes of information from a number of sources, offering quick entry to replicated knowledge for batch processes that require cross-bounded context knowledge, and mixing info utilizing acquainted SQL queries. The answer significantly shines in dealing with high-volume reads from our knowledge sources.

Benefits:

  • Works properly in a relational database
  • Excels at real-time entry to knowledge from a number of enterprise areas
  • Improves efficiency of batch jobs coping with giant knowledge volumes
  • Shops knowledge in acquainted desk format, accessible by way of SQL
  • Enforces clear knowledge possession, with every enterprise space answerable for its knowledge
  • Presents scalable structure that reduces the chance of single level of failure

Disadvantages:

  • Requires further knowledge validation throughout loading processes to take care of knowledge uniqueness
  • Wants cautious administration of major key constraints since Amazon Redshift optimizes for analytical efficiency
  • Could require further monitoring and controls in comparison with conventional RDBMS methods

Listed below are the opposite options we evaluated:

Bulk APIs:

We discovered that Bulk APIs offers an strategy for dealing with giant volumes of information.

Benefits:

  • Close to actual time entry to bulk knowledge by means of a single request
  • Autonomous groups have management over entry patterns
  • Environment friendly batch processing of huge datasets with multi-record retrieval

Disadvantages:

  • Every product workforce must create their very own bulk API
  • In the event you want knowledge from completely different areas, you should mix it your self
  • The workforce offering the API should make sure that it may well deal with giant quantities of requests
  • You would possibly want to make use of a number of APIs to get all the information you need
  • In the event you’re getting knowledge in chunks (pagination), you would possibly miss some info if it modifications between requests

Whereas Bulk APIs provide highly effective capabilities, we discovered they require substantial workforce coordination and cautious implementation to be efficient.

Information Lake:

Our analysis confirmed that knowledge lakes can successfully mix info from completely different elements of our enterprise. They excel at processing giant quantities of information without delay, offering search capabilities by means of unified knowledge codecs, and managing giant volumes of numerous and complicated knowledge.

Benefits:

  • Handles huge knowledge volumes effectively
  • Helps a number of knowledge codecs and buildings
  • Permits complicated analytics and knowledge science workloads
  • Supplies cost-effective storage options
  • Accommodates each structured and unstructured knowledge

Disadvantages:

  • Could not present real-time, high-speed knowledge entry
  • Requires further effort with complicated knowledge buildings, particularly these with many interconnected elements
  • Wants particular methods to prepare knowledge in a easy, flat construction
  • Calls for vital knowledge governance and administration
  • Requires specialised expertise for efficient implementation

Whereas knowledge lakes excel at big-picture evaluation of huge datasets, they weren’t optimum for our real-time knowledge wants and complicated knowledge relationships.

S3 Export/Change: 

In our evaluation, we discovered that S3 Export/Change offers a technique for sharing knowledge between completely different enterprise areas utilizing file storage. This strategy successfully handles giant volumes of information and permits simple filtering of data utilizing knowledge frames.

Benefits:

  • Supplies easy, cost-effective knowledge storage
  • Helps high-volume knowledge transfers
  • Permits simple knowledge filtering capabilities
  • Presents versatile entry management
  • Facilitates cross-region knowledge sharing

Disadvantages:

  • Not appropriate for real-time knowledge wants
  • Requires additional processing to transform knowledge into usable desk format
  • Calls for vital knowledge preparation effort
  • Lacks speedy knowledge consistency
  • Wants further instruments for knowledge transformation

Whereas S3 Export/Change works properly for sharing giant datasets between groups, it didn’t meet our necessities for fast, real-time entry or instantly usable knowledge codecs.

The next desk offers a high-level comparability of the completely different knowledge integration options we thought-about for our modernization efforts. It outlines the place every resolution is most applicable to make use of and when it may not be your best option:

Answer Bulk APIs Information Lake ORDS S3 Export/Change
When to make use of Actual-time operational knowledge is required

Fetching particular knowledge subsets

Processing giant quantities of information without delay

Many bounded context

Close to real-time entry throughout a number of bounded contexts

Massive quantity batch processing

Few bounded contextsHandling giant volumes of information

Level-in-time export is adequate

When to not use Many bounded contexts concerned Actual-time knowledge entry wanted

Structured, transactional knowledge processing

Inside a single bounded context Actual-time knowledge wants

Many bounded contexts

Desk 1: Information Integration Options Comparability

Primarily based on our comparability, we discovered ORDS to be the optimum resolution for our wants, significantly when our batch processes require entry to knowledge from a number of bounded contexts in real-time. Our implementation effectively handles giant volumes of information, considerably bettering the efficiency of our batch jobs. We selected ORDS as a result of it shops knowledge in a well-known desk format, accessible by way of SQL, making it easy and environment friendly for our groups to make use of.

The structure additionally aligns with our domain-driven design ideas by imposing clear knowledge possession, the place every bounded context maintains accountability for its personal knowledge administration. This strategy offers us with each scalability and reliability, decreasing the chance of a single level of failure.

Amazon Redshift: Powering Vanguard’s ORDS Answer

Amazon Redshift serves because the spine of our ORDS implementation, providing a number of essential options that assist our modernization objectives:

Information Sharing

Our resolution leveraged the sturdy knowledge sharing capabilities of Amazon Redshift, obtainable on each Server-based Redshift RA3 cases and Redshift Serverless choices. This performance supplied us with on the spot, safe, and dwell knowledge entry with out copies, sustaining transactional consistency throughout our surroundings. The flexibleness of similar account, cross-account, and cross-Area knowledge sharing has been significantly invaluable for our distributed structure.

Excessive Efficiency

We’ve achieved vital efficiency enhancements by means of Amazon Redshift’s environment friendly question processing and knowledge retrieval capabilities. The system successfully handles our complicated knowledge wants whereas sustaining sturdy efficiency throughout varied workloads and knowledge volumes.

Multi-Availability Zone Assist

Our implementation benefited from Amazon Redshift’s Multi-AZ assist, which maintains excessive availability and reliability for our important operations. This characteristic minimizes downtime with out requiring in depth setup and considerably reduces our threat of information loss.

Acquainted Interface

The relational setting of Amazon Redshift, related conventional databases like Amazon RDS and IBM Db2, has enabled a easy transition for our groups. This familiarity has accelerated adoption and improved productiveness, as our groups can leverage their present SQL experience. By centralizing knowledge from a number of enterprise areas in ORDS utilizing Amazon Redshift, we keep constant, environment friendly, and safe knowledge entry throughout our product groups. This setup is especially invaluable for our batch processing that requires knowledge from varied elements of the enterprise, providing us a mix of efficiency, reliability, and ease of use.

Operational Learn-only Information Retailer (ORDS) utilizing Amazon Redshift

Right here’s how our ORDS structure implements Amazon Redshift knowledge sharing to unravel these challenges:

ORDS Architecture Diagram

Determine 1: Vanguard’s ORDS Structure utilizing Amazon Redshift Information Sharing

Amazon Redshift Ingestion Sample:

We utilized Amazon Redshift’s zero-ETL performance to combine knowledge and allow real-time analytics immediately on operational knowledge, which helped scale back complexity and upkeep overhead. To enrich this functionality and to meet our complete compliance necessities that necessitate full transaction replication, we applied further knowledge ingestion pipelines.

Our knowledge ingestion technique for Amazon Redshift employs completely different AWS companies relying on the supply. For Amazon Aurora PostgreSQL databases, we use AWS Database Migration Service (AWS DMS) to immediately replicate knowledge into Amazon Redshift. For knowledge from Amazon DynamoDB, we leverage Amazon Kinesis to stream the information into Amazon Redshift, the place it lands in materialized views. These views are then additional processed to generate tables for end-users.

This strategy permits us to effectively ingest knowledge from our operational knowledge shops whereas assembly each analytical wants and compliance necessities.

Amazon Redshift Information Sharing:

We used the Amazon Redshift’s knowledge sharing characteristic to successfully decouple our knowledge producers from customers, permitting every group to function inside their very own boundaries whereas sustaining a unified and simplified ruled mechanism for knowledge sharing.

Our implementation adopted a transparent course of: as soon as knowledge is ingested and obtainable in Amazon Redshift desk format, we created views for customers to entry the information. We then established knowledge shares and granted entry to those views to client Amazon Redshift knowledge warehouses for batch processing. In our surroundings with a number of bounded contexts, we’ve established a collaborative mannequin the place customers work with varied producer groups to entry knowledge from completely different knowledge shares, every created per bounded context.

This entry remained strictly read-only—when customers must replace or write new knowledge that falls outdoors their bounded context, they have to use APIs or different designated mechanisms for such operations. This strategy has confirmed efficient for our group, selling clear knowledge possession and governance whereas enabling versatile knowledge entry throughout organizational boundaries. It simplified our knowledge administration and made certain every workforce can function independently whereas nonetheless sharing knowledge successfully.

Instance: VG couple of cross bounded context

Disclaimer: That is supplied for reference functions solely and doesn’t symbolize an actual instance.

Let’s have a look at a sensible instance: our brokerage account assertion technology course of. This cross-bounded context batch course of requires integrating knowledge from a number of sources, accessing tons of of tables and processing giant volumes of information month-to-month. The problem was to create an environment friendly, cost-effective resolution that minimizes knowledge replication whereas sustaining knowledge accessibility.ORDS proved splendid for this use case, because it offers knowledge from a number of bounded contexts with out replication, gives close to real-time entry, and permits simple knowledge aggregation utilizing SQL-like queries in Amazon Redshift.

The next diagram exhibits how we applied this resolution:

ORDS example

Determine 2: Cross-Bounded Context Instance for Brokerage Account Assertion Era

We’d like the next bounded contexts to generate brokerage statements for hundreds of thousands of our purchasers.

  1. Account:
    • Particulars: Consists of details about the consumer’s brokerage accounts, similar to account numbers, sorts, and statuses.
    • Holdings and Positions: Supplies present holdings and positions inside the account, detailing the securities owned, their portions, and present market values.
    • Stability Info: Accommodates the stability info of the account, together with money balances, margin balances, and whole account worth.
  2. Consumer Profile:
    • Private Info: Details about the consumer, similar to their title, date of delivery, and social safety quantity.
    • Contact Info: Consists of the consumer’s electronic mail deal with, bodily deal with, and telephone numbers.
  3. Transaction Historical past:
    • Transaction Data: A complete file of transactions related to the account, together with buys, gross sales, transfers, and dividends.
    • Transaction Particulars: Every transaction file contains particulars similar to transaction date, kind, amount, value, and related charges.
    • Historic Information: Historic knowledge of transactions over time, offering a whole view of the account’s exercise.

By this structure, we effectively generate correct and complete brokerage account statements by consolidating knowledge from these bounded contexts, assembly each our purchasers’ wants and regulatory necessities.

Enterprise Final result

Our journey with the Operational Learn-only Information Retailer (ORDS) and Amazon Redshift has enhanced our consumer expertise (CX) by means of improved knowledge administration and accessibility. By transitioning from our mainframe system to a cloud-based, domain-driven structure, we’ve empowered our autonomous groups and established a resilient batch structure.

This shift facilitates environment friendly cross-domain knowledge entry, maintains high-quality knowledge consistency, and offers scalability. Our ORDS implementation, supported by Amazon Redshift, gives near-real-time entry to giant knowledge volumes, guaranteeing excessive efficiency, reliability, and cost-effectiveness. This modernization effort aligns with our mission to ship distinctive, customized consumer experiences and maintain long-lasting consumer relationships.

Name to Motion

In case you are going through related challenges together with your batch processing methods, we encourage you to discover how an Operational Learn-only Information Retailer (ORDS) can rework your knowledge structure. Begin by assessing your present system’s limitations and figuring out alternatives for enchancment by means of domain-driven design and cloud-based options. Contemplate how this strategy will help you handle giant volumes of information from a number of sources, present quick entry to replicated knowledge for batch processes, and assist high-volume reads from varied knowledge sources.

Take the following step by conducting a proof of idea (POC) to judge ORDS effectiveness in attaining environment friendly cross-domain knowledge entry, bettering the efficiency of batch jobs, and sustaining clear knowledge possession inside your corporation domains. By implementing this resolution, you may improve your knowledge administration capabilities, scale back operational dangers, and drive innovation inside your group. Embrace this chance to raise your knowledge structure and ship distinctive buyer experiences.

Conclusion 

Our transition to a cloud-native, domain-driven structure with ORDS utilizing Amazon Redshift has efficiently reworked our batch processing capabilities in AWS cloud. This modernization effort has considerably enhanced the efficiency, reliability, and scalability of our batch operations whereas sustaining seamless knowledge entry and integration throughout completely different enterprise domains.

The strategic adoption of ORDS has harnessed the potential of cross-domain knowledge entry in a distributed setting, offering us with a sturdy resolution for real-time knowledge entry and environment friendly batch processing. This transformation has empowered us to raised meet the calls for of the digital age, delivering superior buyer experiences and reinforcing our dedication to innovation within the monetary companies business.


Concerning the authors

Malav Shah

Malav Shah

Malav is a Area Architect in Vanguard’s Private Investor Expertise division, with over a decade of expertise in cloud-native options. He focuses on architecting and designing scalable methods, and contributes hands-on by means of growth and proof-of-concept work. Malav holds a number of AWS certifications, together with AWS Licensed Options Architect and AWS Licensed AI Practitioner.

Timothy Dickens

Timothy Dickens

Timothy is a Senior Architect at Vanguard, specializing in superior knowledge streaming designs, AI, real-time knowledge entry, and analytics. With experience in AWS companies like Redshift, DynamoDB, and Aurora Postgres, Timothy excels in creating sturdy distributed architectures that drive innovation and effectivity. Enthusiastic about leveraging cutting-edge applied sciences, Timothy is devoted to delivering reliable, actionable knowledge that empowers assured, well timed decision-making.

Priyadharshini Selvaraj

Priyadharshini Selvaraj

Priyadharshini is a knowledge architect with AWS Skilled Providers, bringing over a decade of experience in serving to clients navigate their knowledge journeys. She focuses on knowledge migration and modernization initiatives, specializing in knowledge lakes, knowledge warehouses, and distributed processing utilizing Apache Spark. As an professional in Generative AI and agentic architectures, Priyadharshini permits clients to harness cutting-edge AI applied sciences for enterprise transformation. Past her technical pursuits, she practices yoga, performs piano and enjoys pastime baking, bringing stability to her skilled life.

Naresh Rajaram

Naresh Rajaram

Naresh is a seasoned Options Architect with over 20 years of expertise, with major focus in cloud computing and synthetic intelligence. Specializing in enterprise-scale AI implementations and cloud structure, he’s serving to clients develop and deploy superior AI options, with specific give attention to autonomous AI methods and agent-based architectures. His experience spans designing cutting-edge AI infrastructures utilizing Amazon Bedrock, Amazon Bedrock AgentCore, and cloud-native AI companies, whereas pioneering work in Agentic AI purposes and autonomous methods.

© 2025 The Vanguard Group, Inc. All rights reserved.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles