What Is Master Data Management?
Master Data Management (MDM) is the discipline of creating and maintaining a single, authoritative, accurate, and complete version of an organisation's most critical shared data entities — known as master data. These entities typically include customers, products, suppliers, employees, locations, and accounts — the core business objects that appear across multiple systems and processes.
Without MDM, organisations suffer from a fragmented view of their most important data. A customer may be recorded differently in the CRM, the billing system, and the e-commerce platform — different names, different addresses, different IDs. This fragmentation causes duplicate communications, inaccurate analytics, poor customer experience, and regulatory compliance failures.
What Is Master Data?
Master data is the non-transactional data that provides the shared context for understanding and interpreting transactional data. It is relatively stable (changes infrequently compared to transactions), shared across multiple business processes and systems, and critical to business operations.
The classic master data domains are:
- Party data: Customers, suppliers, employees, partners
- Product data: Products, services, materials, components
- Location data: Addresses, sites, territories, regions
- Financial data: Accounts, cost centres, legal entities
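The difference between master data and transactional data can be illustrated with a minimal sketch. The `Customer` and `Order` classes and their fields are hypothetical, chosen only for illustration: the customer record is the stable, shared context, and the orders are transactions that only make sense once they are linked to it.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Customer:
    """Master data: relatively stable and shared across processes."""
    customer_id: str
    name: str


@dataclass(frozen=True)
class Order:
    """Transactional data: high volume, refers to master data by ID."""
    order_id: str
    customer_id: str
    order_date: date
    amount: Decimal


def revenue_by_customer(orders, customers):
    """Interpret transactions using the master data they reference."""
    totals = {}
    for order in orders:
        name = customers[order.customer_id].name
        totals[name] = totals.get(name, Decimal("0")) + order.amount
    return totals


customers = {"C-1": Customer("C-1", "John Smith")}
orders = [
    Order("O-1", "C-1", date(2024, 1, 5), Decimal("40.00")),
    Order("O-2", "C-1", date(2024, 2, 9), Decimal("15.50")),
]
print(revenue_by_customer(orders, customers))
```

If the same customer existed under two different IDs, the totals would be split across two names, which is the kind of inaccurate analytics MDM is meant to prevent.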
The Golden Record
The golden record (also called the "single version of the truth" or "best version of the record") is the authoritative, consolidated master data record that MDM produces. It is created by identifying all records across source systems that represent the same real-world entity, and then applying survivorship rules to determine which value to use for each attribute.
For example, if a customer's name appears as "John Smith" in the CRM and "J. Smith" in the billing system, the survivorship rule might specify: "Use the CRM value if it contains a full first name, otherwise use the billing system value." The golden record for this customer would then show "John Smith".
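The rule in this example can be written as a small function. This is a minimal sketch, assuming that "a full first name" means the first token is longer than a single initial; the function names are hypothetical and a real implementation would need a more careful definition.

```python
def has_full_first_name(name: str) -> bool:
    """Return True unless the first token is empty or a bare initial such as 'J.' or 'J'."""
    tokens = name.split()
    if not tokens:
        return False
    first = tokens[0].rstrip(".")
    return len(first) > 1


def surviving_name(crm_name: str, billing_name: str) -> str:
    """Use the CRM value if it contains a full first name, otherwise the billing value."""
    return crm_name if has_full_first_name(crm_name) else billing_name


print(surviving_name("John Smith", "J. Smith"))  # John Smith
```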
Survivorship Rules
Survivorship rules are the business logic that determines which source system's value "survives" (is used) in the golden record when multiple systems have different values for the same attribute. Common survivorship strategies include:
- Most recent: Use the value that was updated most recently, based on each source record's update timestamp
- Most trusted source: Designate one system as the system of record for each attribute and always use its value
- Most complete: Use the value that has the most information (e.g., full name over abbreviated name)
- Majority vote: Use the value that most source systems agree on (e.g., if three systems agree and one disagrees, use the majority value)
- Algorithmic: Apply a scoring algorithm that weights multiple factors to select the best value
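The strategies above can be sketched as small functions over the candidate values for one attribute. Everything here is illustrative: the `SourceValue` class, the system names, the trust weights, and the use of string length as a proxy for completeness are assumptions, not part of any MDM product.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceValue:
    system: str    # source system that supplied the value
    value: str     # the attribute value itself
    updated: date  # when the source last updated it


def most_recent(candidates):
    return max(candidates, key=lambda c: c.updated).value


def most_trusted(candidates, priority):
    # Systems earlier in `priority` win; unlisted systems rank last.
    rank = {system: i for i, system in enumerate(priority)}
    return min(candidates, key=lambda c: rank.get(c.system, len(priority))).value


def most_complete(candidates):
    # Assumption: a longer string carries more information.
    return max(candidates, key=lambda c: len(c.value.strip())).value


def majority_vote(candidates):
    return Counter(c.value for c in candidates).most_common(1)[0][0]


def weighted_score(candidates, trust, trust_weight=0.7):
    # Algorithmic: combine source trust with relative completeness.
    longest = max(len(c.value) for c in candidates) or 1

    def score(c):
        completeness = len(c.value) / longest
        return trust_weight * trust.get(c.system, 0.0) + (1 - trust_weight) * completeness

    return max(candidates, key=score).value


candidates = [
    SourceValue("crm", "John Smith", date(2023, 1, 10)),
    SourceValue("billing", "J. Smith", date(2024, 3, 2)),
    SourceValue("ecommerce", "John Smith", date(2022, 7, 19)),
]
TRUST = {"crm": 0.9, "billing": 0.6, "ecommerce": 0.4}

print(most_recent(candidates))    # J. Smith
print(majority_vote(candidates))  # John Smith
```

Note that the strategies can disagree: here "most recent" picks the abbreviated billing value, while the others pick "John Smith". This is why survivorship is usually defined per attribute rather than once for the whole record.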
The Four MDM Styles
There are four main architectural styles for implementing MDM, each with different trade-offs between implementation complexity and data control:
1. Registry Style
The MDM hub maintains a registry of cross-references between records in source systems, but does not store master data itself. When a consumer needs the golden record, the hub assembles it on the fly from the source systems.
Pros: Low implementation complexity, no data duplication, source systems remain authoritative.
Cons: Real-time assembly can be slow; source systems must be available.
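A minimal sketch of the Registry idea follows. The dictionaries stand in for real source systems, and the IDs, the attribute names, and the "first listed system wins" rule are all hypothetical. The point is that the registry stores only cross-references, and the golden record is built at read time.

```python
# Hypothetical in-memory stand-ins for two source systems.
CRM = {"C-1001": {"name": "John Smith", "email": "john@example.com"}}
BILLING = {"B-77": {"name": "J. Smith", "billing_address": "1 High Street"}}
SOURCES = {"crm": CRM, "billing": BILLING}

# The registry holds cross-references only, never the attribute values.
REGISTRY = {"MDM-1": [("crm", "C-1001"), ("billing", "B-77")]}


def assemble_golden_record(master_id):
    """Build the golden record on demand by reading each linked source record."""
    record = {}
    for system, local_id in REGISTRY[master_id]:
        source_record = SOURCES[system][local_id]  # fails if the source is unavailable
        for attribute, value in source_record.items():
            # Simplistic survivorship: the first system listed in the registry wins.
            record.setdefault(attribute, value)
    return record


print(assemble_golden_record("MDM-1"))
```

Every read touches every linked source system, which is where the latency and availability drawbacks of this style come from.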
2. Consolidation Style
The MDM hub pulls data from source systems, applies matching and survivorship rules, and stores the golden record. The golden record is read-only — source systems are not updated from the hub.
Pros: Good for analytics use cases; source systems are not disrupted.
Cons: Source systems still contain inconsistent data; the hub is downstream only.
3. Coexistence Style
Similar to Consolidation, but the hub publishes the golden record back to source systems. Source systems can still update data locally, but the hub reconciles and redistributes the authoritative version.
Pros: Source systems benefit from cleansed data; bi-directional flow.
Cons: Complex synchronisation logic; risk of data conflicts.
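The step that separates Coexistence from Consolidation is publishing the golden record back to the sources. A rough sketch of that step, assuming records are plain dictionaries and that a source only receives attributes it already holds (the function name and data are hypothetical):

```python
def publish_back(golden, sources):
    """Return, per source system, the attribute values that differ from the golden record."""
    updates = {}
    for system, record in sources.items():
        diff = {
            attribute: value
            for attribute, value in golden.items()
            if attribute in record and record[attribute] != value
        }
        if diff:
            updates[system] = diff
    return updates


golden = {"name": "John Smith", "email": "john@example.com"}
sources = {
    "crm": {"name": "John Smith", "email": "john@example.com"},
    "billing": {"name": "J. Smith"},
}
print(publish_back(golden, sources))  # {'billing': {'name': 'John Smith'}}
```

This sketch ignores the hard part named in the cons: if a source system changes a value locally while an update is in flight, the hub needs rules for deciding which change wins.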
4. Centralised (Transaction Hub) Style
The MDM hub is the system of record. All creates, reads, updates, and deletes for master data go through the hub. Source systems consume master data from the hub rather than maintaining their own copies.
Pros: Maximum data consistency and control; single authoritative source.
Cons: High implementation complexity; requires significant organisational change management.
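In the Centralised style, the hub owns the full record lifecycle. The following is an illustrative in-memory sketch only; the `MasterDataHub` class, its method names, and the `MDM-` ID format are hypothetical, and a real hub would add validation, security, and change notification.

```python
import itertools


class MasterDataHub:
    """Toy system of record: every create, read, update, and delete goes through here."""

    def __init__(self):
        self._records = {}
        self._ids = itertools.count(1)

    def create(self, attributes):
        master_id = f"MDM-{next(self._ids)}"
        self._records[master_id] = dict(attributes)
        return master_id

    def read(self, master_id):
        # Return a copy so consumers cannot change master data behind the hub's back.
        return dict(self._records[master_id])

    def update(self, master_id, **changes):
        self._records[master_id].update(changes)

    def delete(self, master_id):
        del self._records[master_id]


hub = MasterDataHub()
customer_id = hub.create({"name": "John Smith"})
hub.update(customer_id, email="john@example.com")
print(hub.read(customer_id))
```

Because consuming systems only ever call the hub, there is no second copy to reconcile, which is the source of both the consistency benefit and the organisational cost.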
MDM Implementation Steps
- Define master data domains: Identify which entities are master data and prioritise by business impact.
- Profile source data: Understand the current state of master data across all source systems — quality, completeness, format variations.
- Define the canonical data model: Design the target data model for the golden record — what attributes it will contain and their definitions.
- Define matching rules: Specify the logic for identifying records that represent the same real-world entity (exact match, fuzzy match, probabilistic matching).
- Define survivorship rules: Specify which source system's value wins for each attribute when conflicts exist.
- Select MDM style and technology: Choose the architectural style and MDM platform that best fits your requirements.
- Implement, test, and deploy: Build the MDM solution, test with real data, and deploy incrementally.
- Govern ongoing: Establish stewardship processes for ongoing data quality management and exception handling.
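The matching step above can be sketched with Python's standard `difflib` module. This is a simplified illustration that combines an exact match on email with a fuzzy match on name; the field names and the 0.85 threshold are assumptions, and production matching usually adds more attributes, blocking, and manual review of borderline pairs.

```python
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not affect matching."""
    return " ".join(text.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 based on matching character blocks."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def is_same_entity(a: dict, b: dict, threshold: float = 0.85) -> bool:
    # Exact match: identical email addresses are treated as decisive.
    if a.get("email") and a["email"].lower() == b.get("email", "").lower():
        return True
    # Fuzzy match: fall back to name similarity above a chosen threshold.
    return name_similarity(a["name"], b["name"]) >= threshold


crm = {"name": "John Smith", "email": "john@example.com"}
billing = {"name": "Jon  Smith"}
print(is_same_entity(crm, billing))
```

Records that match are then passed to the survivorship rules; pairs that score close to the threshold are typical candidates for the stewardship and exception-handling process in the final step.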
MDM and the CDMP Exam
Master Data Management carries a 10% weighting in the CDMP exam. Key topics include the distinction between master data, reference data, and transactional data; the four MDM styles and their trade-offs; the golden record; survivorship rules; and the relationship between MDM and data governance. The exam frequently asks candidates to tell the four styles apart and to select the most appropriate one for a given scenario.