Data has become a crucial asset for organizations, yielding insights that inform strategic decisions and improve operational efficiency. Even so, many organizations still struggle to use their data effectively because of entrenched data silos, poor discoverability, low data quality, and limited data literacy and analytical skills, all of which slow access to and use of data across the organization. To address these data management challenges, Amazon Web Services (AWS) customers use Amazon DataZone, a data management and governance service that streamlines cataloging, discovering, sharing, and governing data stored across AWS, on-premises environments, and third-party sources.
Since 1926, HEMA has been a well-known name in Dutch retail, providing everyday convenience products with its own distinctive approach to design. With more than 17,000 employees, HEMA offers a wide range of uniquely designed, sustainable products in more than 750 stores in the Netherlands and abroad, including Belgium, Luxembourg, France, Germany, and Austria, with e-commerce platforms available in all of these countries. Five years ago, HEMA built its first e-commerce platform on AWS, which allowed it to develop new solutions quickly using a range of cloud tools and services. Today, data drives every part of the company, from its customer-favorite online cake customization feature to the democratization of data for business insight.
Using Amazon DataZone, HEMA built its data mesh and streamlined data access across multiple business domains.
This post describes HEMA’s journey with Amazon DataZone: the key challenges it faced and the benefits it has realized since deployment in May 2024. By establishing an enterprise-wide data inventory, improving data discoverability, and enabling decentralized collaboration through governed data sharing, Amazon DataZone has transformed how HEMA works with data.
Data landscape at HEMA
After HEMA moved its entire data platform from on premises to the AWS Cloud, the wave of change gave the HEMA Data & Cloud function a new opportunity to invest in and commit to building a data mesh.
The HEMA organization is composed of several business units, each containing a set of services. Services are software applications with a specific functionality that serve a predetermined goal within the organization. Each service is hosted in its own dedicated AWS account and is built and maintained by a product owner and a development team, as illustrated in the following figure.
HEMA has more than 400 services. Of these, 20 are run by teams that operate extract, transform, and load (ETL) pipelines, use dedicated data resources, and produce and consume data assets that are shared across the data mesh.
Challenges of data management in a data mesh
In the first weeks of operating the data mesh, HEMA’s data platform was far from the enterprise ideal. The initial focus was on setting up a reliable, efficient agile team built on solid processes. Data inventories, however, sat in silos in separate environments, which made discovering and sharing data across teams tedious and time-consuming for everyone involved.
Establishing robust data governance is challenging in any organization. In a data mesh, the decentralized nature of the organization adds significantly to this complexity.
HEMA concluded that effective data governance was no longer a nice-to-have, but an essential foundation for a robust and healthy data ecosystem.
Why HEMA chose Amazon DataZone
During the preview, HEMA saw how Amazon DataZone covered all key aspects of data management in a single solution, and it was clear how it would benefit different stakeholders. For technical teams, it provides a robust way to ensure the supply, accessibility, and quality of the data assets in the enterprise data catalog. Business users get a tool that lets them easily find and self-serve the data they need across the data mesh.
Features such as AI-generated metadata gave end-users reliable, use case-specific descriptions of what a given data product offers and which questions it can answer. The subscription feature let them start using a data asset in their own environment within seconds, bypassing the previous lengthy, manual process.
These factors, together with the self-service capabilities, led HEMA to deploy Amazon DataZone at enterprise level.
Solution overview
HEMA’s data landscape is diverse, with multiple teams using a range of technologies and approaches, including Databricks. To manage this landscape, HEMA uses a data mesh architecture on Amazon Web Services (AWS). The architecture maintains a central data platform (CDP), which connects data producers and consumers by providing the necessary infrastructure and architecture. The overall structure can be summarized as follows:
Each service typically uses two AWS accounts: one for pre-production and one for production. This makes it possible to test changes thoroughly before they are deployed to production.
At the core of this architecture lies Amazon DataZone. It helps HEMA consolidate data assets from its various data sources into a single catalog, making them easier to access and manage. It also acts as a bridge between different technologies, such as Databricks and native AWS services. Databricks Delta tables are integrated into Amazon DataZone through the AWS Glue Data Catalog: the technical metadata of the Delta tables is stored in the Data Catalog, which serves as a source for creating assets in the Amazon DataZone business catalog. Access control is enforced through AWS Lake Formation, which provides fine-grained access control over data stored in the data lake as well as secure data sharing. The following diagram illustrates the data mesh architecture.
The Amazon DataZone implementation mirrors the account setup of the individual services: HEMA uses two separate data catalogs, preprod-hema-data-catalog and prod-hema-data-catalog. These catalogs are the backbone of data sharing across environments and give each environment access to the data it requires.
The prod-hema-data-catalog is the production-grade catalog. It facilitates data sharing across production services and, in some cases, pre-production services. Only production services can publish assets to this catalog; assets owned by pre-production services cannot be published to it, although pre-production services can be given access to production data. The following diagram illustrates the structure for each account.
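The rules described for prod-hema-data-catalog can be sketched in a few lines of Python. This is only an illustration of the policy, not how Amazon DataZone is configured: the Stage enum, the Service class, and both function names are hypothetical, and the sketch simplifies "in some cases" to "always allowed" for pre-production consumers.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    PREPROD = "preprod"
    PROD = "prod"


@dataclass(frozen=True)
class Service:
    name: str     # hypothetical service name
    stage: Stage  # which of the service's two AWS accounts is acting


# Catalog names as given in the post, keyed by stage.
CATALOG_BY_STAGE = {
    Stage.PREPROD: "preprod-hema-data-catalog",
    Stage.PROD: "prod-hema-data-catalog",
}


def can_publish_to_prod_catalog(service: Service) -> bool:
    """Only production services may publish assets to the production catalog."""
    return service.stage is Stage.PROD


def can_subscribe_in_prod_catalog(service: Service) -> bool:
    """Assumption: both stages may consume production data.

    The post says pre-production access is granted "in some cases";
    this sketch does not model the conditions for those cases.
    """
    return service.stage in (Stage.PROD, Stage.PREPROD)
```

Keeping the publish and subscribe checks separate reflects the asymmetry in the text: publishing is restricted to production, while consumption is open more widely.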
To ensure isolation between services in the data mesh, a dedicated Amazon DataZone project is created for each service. The environment profiles and environments are created for use by that service alone. The Amazon DataZone setup is managed centrally by the central team. After the central team has completed the configuration, service teams get access to self-service features to create environments that match their specific requirements.
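To make the "one dedicated project per service" idea more concrete, the following Python sketch generates a minimal AWS CloudFormation template containing a single AWS::DataZone::Project resource. The post mentions CloudFormation for infrastructure setup but does not show HEMA's templates, so this is an assumption-laden illustration: the function name, the service name, and the domain identifier are placeholders, and environment profiles and environments are left out.

```python
import json


def project_template(service_name: str, domain_id: str) -> dict:
    """Build a minimal CloudFormation template with one Amazon DataZone project.

    service_name and domain_id are placeholders supplied by the caller.
    """
    # CloudFormation logical IDs must be alphanumeric, so drop the hyphens.
    logical_id = "".join(part.capitalize() for part in service_name.split("-")) + "Project"
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Amazon DataZone project for service {service_name}",
        "Resources": {
            logical_id: {
                "Type": "AWS::DataZone::Project",
                "Properties": {
                    "DomainIdentifier": domain_id,
                    "Name": service_name,
                    "Description": f"Dedicated project for {service_name}",
                },
            }
        },
    }


if __name__ == "__main__":
    # "dzd_exampledomain" is a made-up identifier used only for illustration.
    print(json.dumps(project_template("service-a", "dzd_exampledomain"), indent=2))
```

Generating the template from a function is one possible way for a central team to keep per-service projects uniform; a hand-written template per service would work equally well.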
The following diagram outlines the workflow for onboarding HEMA service teams to Amazon DataZone.
The workflow encompasses the following stages:
- A service team (data producer or consumer) submits a request to the central data platform team to enable data sharing for its service accounts. This is typically the case when a service team needs to publish data to the catalog for others to use, or to access data published by another team.
- On receiving a request, the central data platform team identifies the requirements and creates the projects and environments in Amazon DataZone. The infrastructure is set up using AWS CloudFormation together with a continuous integration and delivery (CI/CD) pipeline. The central data platform team makes sure that the correct AWS account (pre-production or production) is associated with the project’s environment in the corresponding catalog.
- After the projects and environments are set up, service teams can use Amazon DataZone features to securely share and consume data assets.
- Data producers can publish their data assets to the catalog and control access by approving or rejecting subscription requests from interested teams.
- Data consumers, such as Service B, can search the Amazon DataZone catalog and gain access to relevant data assets through subscription requests.
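The publish, search, subscribe, and approve steps in this workflow can be modeled with a small in-memory sketch. It does not call Amazon DataZone; the Catalog and SubscriptionRequest classes, the status strings, and the asset name are hypothetical and serve only to show how the producer stays in control of access.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionRequest:
    asset: str
    consumer: str
    status: str = "PENDING"  # becomes APPROVED or REJECTED once decided


@dataclass
class Catalog:
    assets: dict = field(default_factory=dict)    # asset name -> owning producer
    requests: list = field(default_factory=list)  # all subscription requests

    def publish(self, producer: str, asset: str) -> None:
        """A producer publishes an asset and becomes its owner."""
        self.assets[asset] = producer

    def search(self, text: str) -> list:
        """Case-insensitive substring search over published asset names."""
        return sorted(a for a in self.assets if text.lower() in a.lower())

    def request_subscription(self, consumer: str, asset: str) -> SubscriptionRequest:
        """A consumer asks for access to a published asset."""
        if asset not in self.assets:
            raise KeyError(f"unknown asset: {asset}")
        req = SubscriptionRequest(asset=asset, consumer=consumer)
        self.requests.append(req)
        return req

    def decide(self, producer: str, req: SubscriptionRequest, approve: bool) -> None:
        """Only the producer that owns the asset may approve or reject."""
        if self.assets[req.asset] != producer:
            raise PermissionError("only the owning producer can decide")
        req.status = "APPROVED" if approve else "REJECTED"


if __name__ == "__main__":
    catalog = Catalog()
    catalog.publish("service-a", "sales_orders")
    request = catalog.request_subscription("service-b", "sales_orders")
    catalog.decide("service-a", request, approve=True)
    print(request)
```

The ownership check in decide mirrors the text: a subscription is granted by the producing team, not by the central team or the consumer.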
In a decentralized data mesh, there is a risk that service teams create assets in service accounts they are not authorized to manage, which can lead to governance issues and data mismanagement. To address this, HEMA implemented a two-pronged approach:
- Each project contains only assets owned by the service team responsible for that project, so every service team’s project sets a clear and transparent boundary for the assets it manages.
- Amazon DataZone lets the central team configure custom governance policies, so that service teams can deploy resources only within their own dedicated environments.
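The two guardrails above amount to two checks: the requesting team must own the project, and the target account must belong to one of the project's environments. The following Python sketch expresses that logic under stated assumptions; the Project class, the function name, and the account IDs are invented for illustration and do not reflect how Amazon DataZone policies are actually defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    owner_team: str
    environment_accounts: frozenset  # AWS account IDs of the project's environments


def validate_asset_creation(project: Project, team: str, target_account: str) -> list:
    """Return a list of violations; an empty list means the request is allowed."""
    violations = []
    # Guardrail 1: a project only holds assets owned by its own service team.
    if team != project.owner_team:
        violations.append(f"team {team!r} does not own project {project.name!r}")
    # Guardrail 2: resources may only be deployed in the project's own environments.
    if target_account not in project.environment_accounts:
        violations.append(
            f"account {target_account} is not an environment of project {project.name!r}"
        )
    return violations
```

Returning a list of violations rather than a boolean makes it easy to report every broken rule at once.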
Adoption plan: Strategy
Within HEMA’s data landscape, smooth integration with the data-producing teams is crucial. The central data governance team therefore needed an adoption plan that complements these teams’ workflows and adds value without disrupting their delivery. HEMA’s adoption strategy is founded on three principles:
- Don’t wait until you can launch a comprehensive service with every feature at once. Build a minimum viable product (MVP) that addresses the most critical business need and make it available to the organization as soon as possible.
- HEMA’s data team hosted internal seminars and prepared tailored demos for each stakeholder group, showing how Amazon DataZone would simplify their data sharing. Encourage users to explore the new features and discover how the new capabilities can make their workflow more efficient.
- Stay close to your users, in line with HEMA’s core values and mission. As users move through the adoption process, be on hand to offer support, much like HEMA stays close to its customers as they look for products that make their daily lives easier. Create room for Q&A and build a collaborative experience for everyone, wherever they are on the adoption curve.
Adoption plan: Action points
In implementing its adoption strategy for a decentralized data marketplace with Amazon DataZone, HEMA followed a “start small, refine, and repeat” approach. In practice, this meant that the Data & Cloud team began working with one business unit, then expanded to several business units, while focusing on a single feature: data asset subscription. To build interest and adoption, the pilot drew on underutilized core data assets within the organization.
Once this first feature was well understood and adopted, the next step was to support each business unit in adapting its data pipelines to its own needs.
Once all teams were fully onboarded to the subscription feature, HEMA shifted its focus to introducing the business units to a second key feature: data publishing.
HEMA introduced a new approach that enables domain owners to adopt new features at their own pace, allowing them to opt-in to the next iteration once they’re comfortable with the current implementation.
When adoption reached the point where all core data assets were being consumed through the Amazon DataZone catalog, the Lake Formation resource links previously used to share data across accounts were decommissioned. At the same time, the Data & Cloud team stopped acting as the intermediary for sharing data between business units. This encouraged peer-to-peer data sharing, in which teams talk to each other directly without having to involve a third party.
Outcomes
As adoption accelerated, all relevant business units quickly became proficient in using Amazon DataZone daily to serve their own data needs. A central catalog made it easy to discover, publish, and subscribe to data assets across the entire business. After the initial rollout, HEMA recorded the following results:
- Over 200 data assets published to the catalog
- Over 180 active subscriptions
- Over 100 monthly active users
- More than 20 business teams onboarded
- Data sharing, which used to take an average of 4 working days, is now completed in seconds, without requiring input from anyone else
HEMA also saw considerable benefits that can’t be captured in numbers. Now that teams can independently discover and access data from many sources, they are finding new use cases that were previously hidden by a lack of visibility into what other teams produce. For example, the data science team quickly built a new sales prediction model using data already available in Amazon DataZone, rather than building it from scratch. A collaborative community has also formed, in which teams join forces and co-create strategies to advance HEMA’s data initiatives.
Conclusion
Amazon DataZone enabled HEMA to govern its data at scale. HEMA continues to collaborate closely with AWS on new capabilities while working through the existing items on its roadmap. The team keeps expanding and refining the service by introducing successive waves of features that further improve data management, such as the following:
- A feature that lets data producers track and improve their data assets, while consumers can transparently evaluate the details of an asset before using it in their ETL workflows.
- A feature that lets users and the governance team identify the sources of data, track transformations, and monitor how data assets are used across the organization.
- A feature that gives producers precise control over which data is shared with other teams, so that only relevant assets are shared with the intended groups.
HEMA’s long-term vision is for Amazon DataZone to become the single hub for data sharing and cataloging across the organization. While the current focus is on teams that run ETL pipelines through Amazon DataZone, the ultimate goal is to support all business teams that work with data in their daily operations. Data is a valuable asset for any organization, and HEMA aims to democratize it by building a sustainable data hub based on modern data governance technology.
About the authors
Luis is the Data & AI Governance GTM Lead for the EMEA market at AWS, where he helps customers with their data strategies, starting with strong data governance, and draws on his expertise in end-to-end data and analytics management. Luis is also a public speaking coach based in the Netherlands, and as a father of two sons born 18 years apart, he has gained a unique perspective on understanding diverse viewpoints.
As a Principal Analytics Solutions Architect at AWS, he enjoys solving customers’ data challenges. With a strong background in analytics, distributed systems, and resource orchestration platforms, he serves as a trusted technical advisor to AWS customers.
Tommaso is the Head of Data & Cloud Platforms at HEMA. He led the effort to transform the data team by building a cloud-based data platform on AWS, designed to power a data mesh architecture. Tommaso leads the Solution Architecture team with great enthusiasm, driving both technical and operational work alongside core data management and data governance projects, and he is a frequent public speaker. Outside of work, Tommaso is a dedicated dad who combines his passion for travel and sports with time spent with his family.
As a senior data engineer at HEMA, he plays a pivotal role in driving technological improvements and streamlining data operations within the organization. He uses native AWS tools to streamline data workflows, modernize HEMA’s data architecture, and implement robust, scalable end-to-end data solutions. Outside of work, he enjoys traveling, playing video games, and outdoor activities.