Managing Data Products

Overview

The ability to define, use, and maintain data products is a key feature of MetaKarta. Data products are one of the primary features of the larger data mesh architecture approach. Depending on the nature of the data within a domain (known as domain data) and its consumption models, data can be served as events, batch files, JDBC relational tables, graphs, etc., while maintaining the same semantics (e.g., meaning and utilization).

Domains and the Data Products Model

You must first create a data products model as a custom model in order to store and manage the data products, data domains, ports, usage requests, etc.

In this way,

  • A data products model may contain one or more data domains, and these data domains have a hierarchical structure within a particular data products model.
  • Each data domain may then contain
    • One or more data products. Each data product may contain
      • One or more ports, which are data sources. Each port may be associated with any object in the repository and will therefore be represented by that object's metadata (e.g., table and column specifications, file format, etc.)
      • One or more usage requests
    • One or more data contracts, each of which may be associated with up to one data product
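
The containment rules above can be pictured as a small set of nested types. The following is only an illustrative sketch in Python: the class and field names are assumptions made for this example and are not part of any MetaKarta API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Port:
    name: str
    # Paths of the repository objects the port represents (plain strings here).
    represents: list[str] = field(default_factory=list)


@dataclass
class DataProduct:
    name: str
    description: str = ""
    ports: list[Port] = field(default_factory=list)
    usage_requests: list[str] = field(default_factory=list)


@dataclass
class DataContract:
    name: str
    # A contract may be associated with at most one data product.
    data_product: Optional[DataProduct] = None


@dataclass
class DataDomain:
    name: str
    # Domains are hierarchical, so a domain may hold child domains.
    subdomains: list["DataDomain"] = field(default_factory=list)
    data_products: list[DataProduct] = field(default_factory=list)
    data_contracts: list[DataContract] = field(default_factory=list)


@dataclass
class DataProductsModel:
    name: str
    domains: list[DataDomain] = field(default_factory=list)


# Hypothetical instance: one model, one domain, one data product.
product = DataProduct("Cloud DW Finance Customer")
finance = DataDomain("Finance", data_products=[product])
model = DataProductsModel("My Company Data Products", domains=[finance])
```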

Create and Edit a Data Product

Once you have created a data products model, you may add a hierarchy of data domains to it.

You must have at least one data domain in order to create data products and data contracts.

Steps

  1. Sign in as a user with at least the Metadata Management capability object role assignment on the data products model you wish to edit.
  2. Navigate to the model and click the Overview tab.
  3. Navigate to the Data Domain in which you wish to include the new data product.
  4. Click the + CREATE to the right of the Data Products section.
  5. Enter a NAME and DESCRIPTION for the data product and click OK.

Workflow may be enabled for custom models. Please refer to the specific examples for glossaries (a type of custom model) for full details on how to set up and use workflow with a custom model.

Example

Sign in as Administrator and navigate to the object page for the My Company Data Products model.

Click Finance in the Data Domains section to open that data domain.

Edit a Data Product Properties

Click the + CREATE to the right of the Data Products section and enter "Cloud DW Finance Customer" in the NAME and "Cloud DW Finance Customer data product in the User Guide" in the DESCRIPTION.

Click OK.

Delete a Data Product

Hover the mouse over the new Cloud DW Finance Customer data product and click the red x.

Edit a Data Product

In addition to the Name and Description, a data product may have:

Data Product Ports and Lineage

Ports provide a way to associate any collection of repository objects with a data product. In particular, in the case of imported data model elements, e.g. schemas, tables, columns, they act as inputs to the data product and thus define its scope in terms of data flow lineage.

A data product has ports that contain objects, and it can consume the “output” components of other data products as port objects.

There are several types of ports:

  • Input port(s): an input port describes a set of services exposed by a data product to collect its source data and make it available for further internal transformation. An input port can receive data from one or more upstream sources in push mode (i.e. asynchronous subscription) or pull mode (i.e. synchronous query). Each data product may have one or more input ports.
  • Output port(s): an output port describes a set of services exposed by a data product to share the generated data in a way that can be understood and trusted. Each data product may have one or more output ports.
  • Discovery port(s): a discovery port describes a set of services exposed by a data product to provide information about its static role in the overall architecture, such as purpose, structure, location, etc. Each data product may have one or more discovery ports.
  • Observability port(s): an observability port describes a set of services exposed by a data product to provide information about its dynamic behavior in the overall architecture like logs, traces, audit trails, metrics, etc. Each data product may have one or more observability ports.
  • Control port(s): a control port describes a set of services exposed by a data product to configure local policies or perform highly privileged governance operations. Each data product may have one or more control ports.

You may also use the Lineage tab of a data product to obtain a simplified business view of product lineage, which can help you quickly grasp the provenance and relationships of your data products.

  • Data producers understand how their data products are used within the organization.
  • Data consumers gain visibility into the provenance of the data products they use.
  • Producers and consumers better understand data's business relevance and impact.

Technically, the product lineage hides or summarizes in the lineage graph any non-product objects (those not contained in one of the ports). The product lineage depicts the input/output relationships between products. This diagram has the potential to depict the whole product data estate of an organization. The product lineage can show lineage within the same domain or across different domains. Each data domain has a designated color, and the product lineage depicts these colors.
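
To make the summarization idea concrete, here is a minimal Python sketch that collapses an object-level lineage graph into product-to-product edges. It is not MetaKarta's implementation: the function name, the edge list, and the object-to-product mapping are all assumptions for illustration.

```python
from collections import defaultdict


def product_lineage(edges, owner):
    """Collapse object-level lineage into product-level lineage.

    edges: iterable of (source_object, target_object) data flow links.
    owner: dict mapping an object to the data product whose ports contain it.
    Objects missing from `owner` are non-product objects and are hidden.
    """
    downstream = defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)

    result = set()
    for start, product in owner.items():
        stack = list(downstream.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            target = owner.get(node)
            if target is not None:
                # Reached an object that belongs to a product: record the
                # product-level edge and stop walking along this path.
                if target != product:
                    result.add((product, target))
                continue
            # Non-product object: walk through it without showing it.
            stack.extend(downstream.get(node, ()))
    return result


# Hypothetical object names and product assignments.
edges = [
    ("raw.orders", "stg.tmp"),
    ("stg.tmp", "dw.CUSTOMER"),
    ("dw.CUSTOMER", "report.view"),
    ("report.view", "mart.REVENUE"),
]
owner = {
    "raw.orders": "Sales Orders",
    "dw.CUSTOMER": "Cloud DW Finance Customer",
    "mart.REVENUE": "Revenue Mart",
}
links = product_lineage(edges, owner)
```

In this sketch the intermediate objects stg.tmp and report.view belong to no product, so they disappear from the result and only the links between the three products remain.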

Create a Data Product Port

Once you have created a data product, you may add a flat list of ports to it.

Data product ports are directly contained within that data product. A data product may have any number of ports, each referencing any number of repository objects.

Steps

  1. Sign in as a user with at least the Metadata Management capability object role assignment on the data products model you wish to edit.
  2. Navigate to the model and click the Overview tab.
  3. Navigate to the Data Domain and then Data Product in which you wish to include the new data product ports.
  4. Click the + CREATE to the right of the Ports section.
  5. Enter a NAME and DESCRIPTION for the data product port and click OK.

Workflow may be enabled for custom models. Please refer to the specific examples for glossaries (a type of custom model) for full details on how to set up and use workflow with a custom model.

Example

Sign in as Administrator and navigate to the object page for the My Company Data Products model.

Click Finance in the Data Domains section to open that data domain.

Click the Cloud DW Finance Customer data product.

Edit Data Product Port Properties

Click the + CREATE to the right of the Ports section and enter "Cloud DW Snowflake Tables" in the NAME and "Cloud DW Finance Customer data product port in the User Guide" in the DESCRIPTION. Also, enter "Output" in the DATA PRODUCT PORT TYPE.

Click OK.

Delete a Data Product Port

Hover the mouse over the new Cloud DW Snowflake Tables data product port and click the red x.

Edit a Data Product Port

At this stage, you may pick repository objects to include in the port. In this case, we will pick some tables. For this activity, we will use the Represents links in a port.

Steps

  1. Sign in as a user with at least the Metadata Management capability object role assignment on the data products model you wish to edit.
  2. Navigate to the model and click the Overview tab.
  3. Navigate to the Data Domain, then the Data Product, and then the Data Product Port in which you wish to include the new data product port objects.
  4. Click the + ADD to the right of the Represents section.
  5. Select the repository objects the port should represent and click OK.

Workflow may be enabled for custom models. Please refer to the specific examples for glossaries (a type of custom model) for full details on how to set up and use workflow with a custom model.

Example

Sign in as Administrator and navigate to the object page for the My Company Data Products model, the Finance data domain, Cloud DW Finance Customer data product and ultimately the Cloud DW Snowflake Tables data product port.

Click the + ADD to the right of the Represents section and specify Database > Tables as the CATEGORY of objects to select.

Then click + FILTER and filter on Model (scope)

and select the Cloud DW Snowflake model:

Then select these tables:

  • Cloud DW Snowflake > DEMO > CUSTOMER
  • Cloud DW Snowflake > DEMO > CUSTOMER_PAYMENT_DATE
  • Cloud DW Snowflake > DEMO > CUSTOMER_PO_DATE
  • Cloud DW Snowflake > DEMO > CUSTOMER_PO_INVOICE_ITEM
  • Cloud DW Snowflake > DEMO > GL_ACCOUNT

And the result:

Data Product Score

Product scores help build trust in data products.

Data products need to have the following basic qualities:

  • Understandable
  • Addressable
  • Secure
  • Interoperable
  • Quality

The UI shows the product score as a gauge.

Based on the qualities of data as a product, product scores can help you signal the accuracy and completeness of your data products, helping build trust in them.

MetaKarta calculates and assigns a product score to your data products based on preset metadata completeness and data quality criteria. MetaKarta evaluates metadata enrichment on the data product, its components, or both. Using a weighted scoring method, values are automatically assigned based on how well your data product satisfies the five qualities of data as a product. MetaKarta scores your data product on a percentage basis.

You can instruct MetaKarta to ignore an individual quality score by setting its weight to 0.
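
The exact formula MetaKarta uses is not given here. As a rough illustration only, the sketch below assumes the overall score is a weighted average of the five quality scores, each expressed as a percentage, so that a weight of 0 removes a quality from the result. The function name and the sample values are hypothetical.

```python
def product_score(quality_scores, weights):
    """Weighted average of per-quality scores (each 0 to 100).

    Assumption: qualities with weight 0 are ignored entirely, and a missing
    weight is treated as 0. This is a sketch, not MetaKarta's actual formula.
    """
    total_weight = sum(weights.get(q, 0) for q in quality_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        score * weights.get(q, 0) for q, score in quality_scores.items()
    )
    return weighted_sum / total_weight


# Hypothetical per-quality scores for one data product.
scores = {
    "Understandable": 80,
    "Addressable": 100,
    "Secure": 50,
    "Interoperable": 60,
    "Quality": 90,
}
equal_weights = {q: 1 for q in scores}
ignore_secure = {**equal_weights, "Secure": 0}

overall = product_score(scores, equal_weights)          # 76.0
without_secure = product_score(scores, ignore_secure)   # 82.5
```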

The five qualities of data as a product are:

Understandable

A data product is considered understandable when it has the necessary documentation to help data consumers better understand it. MetaKarta quantifies how understandable a product is by the presence of a contract and documentation (definitions and "is defined by" terms) on the data product and its components.

Addressable

A data product is considered addressable when it has well-documented owners. MetaKarta quantifies how addressable a data product is in terms of owners assigned to the data product and its components.

Secure

A secure data product will signify the sensitivity of data. MetaKarta quantifies how secure a data product is based on sensitivity classifications on the product and tags attached to its components.

For example, a product has

  • 0 security when it and its components do not have sensitivity classifications
  • mid security when
    • it does not have a sensitivity classification, but its components have
    • it has a sensitivity classification, but some of its components do not
  • high security when it and the majority of its components have the same sensitivity classification
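
The three cases above can be read as a small decision rule. The Python sketch below is one possible interpretation, not MetaKarta's algorithm: the function name is hypothetical, an unclassified item is represented by None, and the "high" case is checked before "mid" because the two descriptions overlap.

```python
def security_level(product_class, component_classes):
    """Return "none", "mid", or "high" for a data product.

    product_class: sensitivity classification of the product, or None.
    component_classes: one classification (or None) per component.
    Assumption: any case the list above does not cover falls back to "mid".
    """
    classified = [c for c in component_classes if c is not None]
    if product_class is None and not classified:
        return "none"
    if product_class is not None:
        matching = sum(1 for c in component_classes if c == product_class)
        # "Majority" is taken to mean strictly more than half.
        if matching * 2 > len(component_classes):
            return "high"
    return "mid"


# Hypothetical classifications.
level_unclassified = security_level(None, [None, None])
level_components_only = security_level(None, ["PII", None])
level_partial = security_level("PII", ["PII", None, None])
level_majority = security_level("PII", ["PII", "PII", None])
```
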

Interoperable

The interoperability of a data product is determined based on the visibility and completeness of technical lineage between data assets. MetaKarta quantifies how interoperable a data product is by the percentage of components with downstream lineage links.
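
As a worked illustration of that percentage, the short Python sketch below counts the components that have at least one downstream lineage link. The function name, component names, and link data are assumptions for this example only.

```python
def interoperability_score(components, downstream_links):
    """Percentage of components that have at least one downstream link.

    components: list of component names.
    downstream_links: dict mapping a component to its downstream targets.
    """
    if not components:
        return 0.0
    linked = sum(1 for c in components if downstream_links.get(c))
    return 100.0 * linked / len(components)


# Hypothetical lineage: three of the five tables feed something downstream.
tables = [
    "CUSTOMER",
    "CUSTOMER_PAYMENT_DATE",
    "CUSTOMER_PO_DATE",
    "CUSTOMER_PO_INVOICE_ITEM",
    "GL_ACCOUNT",
]
links = {
    "CUSTOMER": ["report_a"],
    "CUSTOMER_PO_DATE": ["report_b"],
    "GL_ACCOUNT": ["report_a", "report_c"],
}
score = interoperability_score(tables, links)  # 60.0
```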

Quality

A data product's data quality is determined based on the data quality scores of its components.

Updating the Data Product Score

Data product quality scores and the computed overall score can be recomputed on demand or on a schedule. In both cases, you do so by defining a scheduled operation.

Steps

  1. Sign in as a user with at least the Application Administrator capability global role assignment.
  2. Go to MANAGE > Schedules in the banner.
  3. Give the scheduled operation a NAME.
  4. Choose an OBJECT from the repository on which you will schedule the update action.
  5. For the OPERATION, pick the Update data products operation.

Example

Sign in as the Administrator user. Go to MANAGE > Schedules in the banner. Click + Add and enter values as below:

Click SAVE.

To run the schedule immediately, right-click on the scheduled operation just created and select Run Operation Now.

Now, returning to the data product Overview tab, we have scores: