DMBOK Context
A data catalog is a key deliverable of the Metadata Management knowledge area (11% CDMP weight). The DMBOK distinguishes between a metadata repository (the back-end store of all metadata) and a data catalog (the front-end discovery tool that makes metadata accessible to users). Both are components of a complete metadata management architecture.
Data Catalog Components
| Component | Description |
|---|---|
| Asset inventory | Searchable index of all data assets across the organisation |
| Technical metadata | Schemas, data types, row counts, system names |
| Business metadata | Definitions, owners, business rules, glossary links |
| Operational metadata | Last updated, data quality scores, usage statistics |
| Data lineage | Origin, transformation history, downstream dependencies |
| Social metadata | User ratings, comments, certified/trusted flags |
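As a rough illustration of how the six components might sit together on a single catalog entry, here is a minimal Python sketch. The `CatalogEntry` class, its field names, the `search` helper, and the sample assets are all hypothetical; they are not taken from the DMBOK or from any catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One data asset in a hypothetical catalog, grouped by the components above."""
    name: str                                        # asset inventory: searchable identity
    technical: dict = field(default_factory=dict)    # schemas, data types, row counts, system names
    business: dict = field(default_factory=dict)     # definitions, owners, business rules
    operational: dict = field(default_factory=dict)  # last updated, quality scores, usage
    lineage: dict = field(default_factory=dict)      # upstream origins, downstream dependencies
    social: dict = field(default_factory=dict)       # ratings, comments, certified flags


def search(entries, term):
    """Return names of entries whose name or business definition contains the term."""
    term = term.lower()
    return [
        e.name
        for e in entries
        if term in e.name.lower() or term in e.business.get("definition", "").lower()
    ]


# Illustrative sample data only.
entries = [
    CatalogEntry(
        name="sales.orders",
        technical={"system": "warehouse", "columns": {"order_id": "INTEGER"}, "row_count": 1200},
        business={"definition": "Confirmed customer orders", "owner": "Sales Operations"},
        operational={"last_updated": "2024-01-31", "quality_score": 0.97},
        lineage={"upstream": ["crm.raw_orders"], "downstream": ["finance.revenue_report"]},
        social={"rating": 4.5, "certified": True},
    ),
    CatalogEntry(
        name="hr.employees",
        business={"definition": "Current staff records", "owner": "HR"},
    ),
]

print(search(entries, "customer"))
```

The search matches on business metadata as well as the asset name, which reflects the discovery role of a catalog: users can find an asset by what it means, not only by what it is called.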
Active vs Passive Data Catalogs
A passive data catalog is manually maintained: humans update the metadata. An active data catalog automatically harvests metadata from connected systems, keeping the catalog current without manual effort. Modern enterprise data catalogs (Collibra, Alation, Informatica, Microsoft Purview) are active catalogs with automated metadata harvesting and AI-assisted classification.
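To make the idea of automated harvesting concrete, the sketch below reads technical metadata directly from a source system, using an in-memory SQLite database as a stand-in. The `harvest_technical_metadata` function and the `customers` table are hypothetical. Commercial catalogs use their own connectors and schedulers, which are not shown here.

```python
import sqlite3


def harvest_technical_metadata(conn):
    """Collect table names, column types, and row counts from a SQLite connection."""
    harvested = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns rows of (cid, name, type, notnull, dflt_value, pk).
        columns = {row[1]: row[2] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        harvested[table] = {"columns": columns, "row_count": row_count}
    return harvested


# Stand-in source system (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO customers (email) VALUES ('a@example.com')")

before = harvest_technical_metadata(conn)

# The source schema changes; nobody edits the catalog by hand.
conn.execute("ALTER TABLE customers ADD COLUMN country TEXT")
after = harvest_technical_metadata(conn)

print(before["customers"]["columns"])
print(after["customers"]["columns"])
```

Re-running the harvest picks up the new `country` column without any manual edit. In a passive catalog, the same change would go unrecorded until someone updated the entry by hand.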
CDMP Exam Relevance
Data catalog concepts appear in the Metadata Management domain. Key exam topics: the distinction between a data catalog and a data dictionary, the difference between a metadata repository and a data catalog, the types of metadata stored in a catalog, and the role of a data catalog in supporting data governance and self-service analytics.