With the aim of enhancing supply chain data quality, the Elanco team conducted a series of brainstorming sessions focused on two main aspects: data quality rule creation and the data quality improvement process. A data warehouse, a web application integrated with that warehouse, and an interactive dashboard were established, leveraging SAP and Azure resources through Delta Live Tables. Here is their success story…
Elanco is an animal health organization with an extensive portfolio of more than 6,800 SKU-market combinations. With a widespread presence spanning manufacturing units and sales operations, both internal and external, Elanco operates on a global scale. Maintaining the integrity of critical data stored in SAP tables, with millions of entries measured across key dimensions including accuracy, completeness, and integrity, is of utmost importance. Managing the vast supply chain master data entails overseeing hundreds of fields, each governed by specific rules and interdependencies. However, due to limitations in data quality monitoring options, it is challenging to identify and program all the necessary rules. Consequently, the time and effort required to conduct regular reviews of data quality are significant.
Elanco follows the IPP (Innovation, Product & Portfolio) strategy, a comprehensive approach to addressing the evolving needs of customers in the animal health landscape and delivering innovative solutions that leverage products, processes, and cutting-edge technologies to enhance animal well-being. One of the key transformation success stories of the Elanco supply chain team is Material Master Data Miner, a robust data quality management system. It is a first-of-its-kind, ML-based application that not only measures current data quality but also helps identify discrepancies based on rules mined from the data itself.
CHALLENGES
Unable to Measure Data Quality and Identify Problematic Entries: The existing approach of relying on BO reports or manual analysis on local machines falls short of providing a comprehensive solution for assessing data quality. It not only lacks a holistic view but also demands significant time investment.
Limited Agility in Adapting to New Rules: The current process of creating and implementing rules in the existing system is time-consuming. Given the importance of regularly evaluating data, a more agile approach is crucial to respond swiftly to new rules and requirements.
Absence of a Process to Rectify Data Entries Resulting from Knowledge Gaps: There is no structured process for correcting data entries created due to a lack of knowledge about rule creation. This gap hinders the accuracy and reliability of the data.
Investigating New Rules and Relationships within a Vast Data Set: Manual detection of rules and relationships among data fields poses an enormous challenge for data stewards, given the exceptionally large volume of data. The complexity and scale make it nearly impossible to accomplish this task manually.
Lack of a Standardized Approach to Data Quality Improvement: Establishing a standardized approach encompassing dedicated functionalities, from data creation to rule creation and maintenance, is crucial for fostering continuous improvement in data quality. Such an approach would streamline the entire process and facilitate consistent enhancements in data quality throughout the organization.
APPROACH
With the aim of enhancing supply chain data quality, a series of brainstorming sessions were conducted involving industry experts and senior leadership, leading to the creation of user stories. The initiative focuses on two main aspects: data quality rule creation and the data quality improvement process.
A data warehouse was established, leveraging SAP and Azure resources through Delta Live Tables. In Azure, components such as Databricks, storage containers, and Data Factory were implemented to support rule creation by identifying relationships among data points. These resources were seamlessly integrated with a web-based frontend application with features for creating, modifying, editing, activating, and deactivating rules. Machine learning is employed to prioritize rules based on validation metrics such as lift, support, and confidence, while also helping to identify outliers and understand the relationships between data fields.
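To make the rule-prioritization metrics concrete, the minimal Python sketch below computes support, confidence, and lift for one hypothetical candidate rule on an illustrative material master extract; the field names, values, and the rule itself are assumptions for illustration, not Elanco's actual data model or mining engine.

    import pandas as pd

    # Hypothetical material master extract (illustrative values only)
    df = pd.DataFrame({
        "material_type":    ["FERT", "FERT", "ROH", "FERT", "FERT"],
        "procurement_type": ["E",    "E",    "F",   "E",    "F"],
    })

    # Candidate rule: material_type == "FERT"  =>  procurement_type == "E"
    antecedent = df["material_type"] == "FERT"
    consequent = df["procurement_type"] == "E"

    support = (antecedent & consequent).mean()               # share of rows matching both sides
    confidence = (antecedent & consequent).sum() / antecedent.sum()
    lift = confidence / consequent.mean()                    # > 1 suggests a meaningful association

    print(f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")

    # Rows that satisfy the antecedent but violate the consequent are outlier candidates
    print(df[antecedent & ~consequent])

Rules with high support, confidence, and lift are the ones worth promoting to active data quality checks, while the violating rows become the outliers surfaced for review.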
Users are empowered to define rules specific to tables and measure field quality, facilitating the correction of data entries. To gain insight into tables with poor data quality and problematic data entries, a live Microsoft Power BI dashboard was implemented. Key performance indicators (KPIs) on the dashboard provide valuable information on dimensions such as accuracy, completeness, and integrity. The dashboard's outputs are used to establish a data quality improvement process, enabling the creation of an action plan. Data owners and stewards collaborate closely to rectify problematic entries, thereby improving overall data quality through a continuous improvement approach. Additionally, ad hoc and periodic rule mining jobs can be scheduled from the frontend application. This comprehensive approach empowers Elanco to proactively address data quality concerns, support efficient decision-making, and drive continuous enhancements across its supply chain.
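As a rough illustration of how such KPIs might be derived before being surfaced in Power BI, the hypothetical sketch below computes a per-field completeness score and the pass rate of one user-defined rule; the fields and the rule are assumptions, not Elanco's actual definitions.

    import pandas as pd

    # Hypothetical material master extract (illustrative values only)
    df = pd.DataFrame({
        "material":     ["M001", "M002", "M003", "M004"],
        "gross_weight": [1.2, None, 0.0, 3.5],
        "weight_unit":  ["KG", "KG", None, "KG"],
    })

    # Completeness KPI: share of non-missing values per field
    completeness = df.notna().mean().round(2)

    # Accuracy KPI for one assumed user-defined rule:
    # "if gross_weight is maintained, weight_unit must be maintained as well"
    failed_mask = df["gross_weight"].notna() & df["weight_unit"].isna()
    failed_entries = df[failed_mask]          # problematic records listed on the dashboard
    rule_pass_rate = 1 - failed_mask.mean()

    print("Completeness per field:\n", completeness)
    print("Rule pass rate:", round(rule_pass_rate, 2))
    print("Failed entries:\n", failed_entries)

Scores of this kind, aggregated per table and per rule, are what feed the dashboard KPIs and the resulting action plans.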
OUTCOME
As mentioned above, the Material Master Data Miner application was developed with key features that help in:
Creating new rules - Each network team has a network data steward (NDS) and data owners who are trained to run the rule mining engine to investigate new rules. The same people have the ability to create user-defined rules. Governance over rule activation and changes is needed, as what might seem like a great rule for one network could completely break another. The owner role approves new rules and rule changes, ensuring they are written correctly and are suitable to apply globally. This helps to create and implement new rules among data points and identify outliers. The outliers can be analysed using Power BI, which details why a record failed a particular rule so that an action plan can be developed. Rules can be activated, deactivated, or edited as per the data owners' or stewards' requirements (a simplified sketch of such a rule follows these points).
Driving Data Quality – As a continuous improvement initiative, each network team has an NDS who takes ownership of the failed records for their network and affiliates, works with the owner to confirm that these really are problem records (not a rule problem), and then works with their team to get them fixed. NDSs and owners run scheduled mining jobs on a periodic basis, on combinations of data fields or tables, or by merging multiple tables into one, to investigate new rules and enable real-time outlier identification and data correction. Material master data quality improved to 96% for the in-scope tables after implementation of the application.
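The sketch below illustrates, under assumed names and a deliberately simplified structure, how a user-defined rule with an activation flag could be represented and applied to flag failing records for the NDS; it is not Elanco's actual implementation.

    from dataclasses import dataclass
    import pandas as pd

    # Hypothetical representation of a user-defined rule with an activation flag
    @dataclass
    class UserDefinedRule:
        rule_id: str
        table: str
        description: str
        condition: str          # pandas query expression that valid records must satisfy
        active: bool = False

    # Hypothetical MARA-like extract (illustrative values only)
    mara = pd.DataFrame({
        "material":         ["M001", "M002", "M003"],
        "material_type":    ["FERT", "FERT", "ROH"],
        "procurement_type": ["E", "F", "F"],
    })

    rule = UserDefinedRule(
        rule_id="R001",
        table="MARA",
        description="Finished goods must be produced in-house",
        condition="material_type != 'FERT' or procurement_type == 'E'",
    )

    rule.active = True  # activation after owner approval; set to False to deactivate

    if rule.active:
        # Records violating the rule are the outliers surfaced to the NDS for correction
        outliers = mara.query(f"not ({rule.condition})")
        print(f"{rule.rule_id} failed records:\n", outliers)

In practice, rule definitions of this kind would typically live in a central store and be evaluated by the scheduled mining and validation jobs rather than by inline scripts.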