Reece Group Introduces New Data Platform Enhancing Security and Self-Serve Access
Reece Group, a prominent seller of plumbing and bathroom supplies, has recently implemented a "next-generation data platform" intended to give departments and teams across the organization secure, controlled self-service access to data.
Erik Pak, Applied Data Science Principal at Reece Group, told the Databricks Data Intelligence Day in Melbourne that the company had previously relied heavily on a centralized Business Intelligence (BI) team for analytics and reporting.
"As an enterprise-level retail company, we have a centralized BI team, similar to many other companies in our industry, and naturally, we encounter many common challenges," Pak remarked.
Pak described several recurring obstacles facing Reece Group's BI team. The team does not have the capacity to manage every BI project across the company, and the data warehouse often struggles to serve the volume of data that several departments need at the same time.
Pak also noted that long-running queries from some users can degrade system performance for everyone else. This is a particular concern for overnight jobs, where any disruption could affect critical processes.
Pak said Reece Group wants departments and teams to take greater ownership of data within the data platform itself, not only in the source systems. That shift meant enabling a higher degree of self-service for data management.
With data security in mind, Reece placed strong emphasis on a platform that offers physical isolation of datasets and compute. The platform also needed to make access control easy to manage, so that a department could grant other departments or teams partial or full access to its data.
Pak stressed that data owners need a user-friendly mechanism for managing access control if they are to take ownership of, and responsibility for, their data within the platform.
To support this approach, Reece Group set up Databricks in an AWS account managed by the data platform team. A Databricks workspace was then created for each department, with dedicated cloud storage and compute capacity allocated within the department's own AWS account.
For compute, each workspace has a shared cluster and a shared SQL warehouse at the departmental level.
Depending on a team's usage patterns, Reece may also create team-based clusters to meet its specific requirements, so that resources are allocated according to usage levels.
Individual users with specific use cases may need their own dedicated compute, and Reece Group accommodates such requests on an ad hoc basis.
The organization has also implemented serverless SQL warehouses for certain query processing tasks, which has improved both user experience and cost efficiency.
With the Databricks workspaces established, Reece uses Unity Catalog to handle data access requests between workspaces.
Unity Catalog offers centralized access control, auditing, lineage, and data discovery across Databricks workspaces. Data owners can grant users from other workspaces access to their data, including access to only a portion of it.
This approach has enabled Reece to achieve its goals of enhancing data security, promoting user self-service, and decentralizing responsibility and accountability for data access.
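As a rough illustration of what granting access to "only a portion" of a department's data can look like, the sketch below builds Unity Catalog `GRANT` statements in Python. It is not Reece's implementation: the function name `build_read_grants` and the catalog, schema, table, and group names are all hypothetical, and it only assumes the standard Unity Catalog privileges `USE CATALOG`, `USE SCHEMA`, and `SELECT`.

```python
from typing import List, Optional


def build_read_grants(
    catalog: str,
    schema: str,
    principal: str,
    tables: Optional[List[str]] = None,
) -> List[str]:
    """Build Unity Catalog SQL statements that give `principal` read access.

    If `tables` is None, SELECT is granted on the whole schema (full access).
    Otherwise SELECT is granted only on the listed tables (partial access).
    All names passed in are illustrative placeholders.
    """
    # A principal needs USE CATALOG and USE SCHEMA before SELECT takes effect.
    statements = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO `{principal}`",
    ]
    if tables is None:
        statements.append(
            f"GRANT SELECT ON SCHEMA {catalog}.{schema} TO `{principal}`"
        )
    else:
        for table in tables:
            statements.append(
                f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO `{principal}`"
            )
    return statements


if __name__ == "__main__":
    # Hypothetical example: a sales department shares one table with finance.
    for stmt in build_read_grants("sales", "orders", "finance-analysts", ["invoices"]):
        print(stmt)
```

A data owner could run the resulting statements from a notebook or SQL editor in their own workspace. Generating them from a small helper like this keeps the partial-versus-full distinction explicit.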
Pak emphasized the importance of treating data as a product and of extending data ownership beyond maintaining the source system to cover how data is used across the organization. This shift can help lighten the workload of BI teams, allowing them to focus on more strategic initiatives.