In some cases, one may wish to use the Repository to maintain a version history for each harvest or import of a model. These versions are individual objects within the Repository and represent the object's contents at a specific point in time.
The Administrator may manage any number of versions. By default, however, the UI only shows one version of a model. The technical user may use the MANAGE > Repository function to view a multi-version user interface (Tree) mode at any time. Also, a particular version may be designated to be the default version. It is the default version of a model which is used when the metadata manager UI is in single-version mode.
In addition, when including a model in a configuration, one is really including a specific version of that model in the configuration. This means that one may control which versions of which models are to be published at any point in time. E.g., one may place the approved version of a model in a published configuration while data modelers continue to edit and upload newer versions as work in progress (unpublished).
Otherwise, users are restricted to a single configuration at a time, and thus in those cases the UI only shows one version of a model.
Add a New Version to a Model
Adding a new version of a model involves:
- For imported models, harvesting or importing the model
- For custom models, creating a new model (and thus version) in MANAGE > Configuration or MANAGE > Repository
When a new version is created, it is given a default name which is a text string based upon the UTC date time that the version was created. See rules for date assignments.
For configurations, the process is different. You create new versions by going to MANAGE > Repository, right-clicking the model, and picking the option to create a new version.
Review Import Log
Steps
- Sign in as a user with at least the Metadata Management capability object role assignment on the model.
- Either:
  - Go to MANAGE > Configuration in the banner. Select the model. Click on the Versions tab.
  - Go to MANAGE > Repository. Expand the tree to see the model and expand the model to see its versions.
- As a model may be imported several times, select the log for the specific date and time of the import in question and click on View Log.
Publish a Version of a Model
When including a model in a configuration, one is really including a specific version of that model in the configuration. This means that one may control which versions of which models are to be published at any point in time. E.g., one may place the approved version of a model in a published configuration while data modelers continue to edit and upload newer versions as work in progress (unpublished).
To publish then means to add the version of the model to the published version of a configuration.
Steps
- Sign in as a user with at least the Metadata Editing capability object role assignment on the configuration.
- Go to MANAGE > Repository.
- Right-click the Published version of the configuration and select Switch to this configuration.
Be sure the Repository Panel is filtered to enable showing of versions.
- Click the plus sign and select Existing model for the models you wish to Publish.
- You may now view the model by default when signed in to that configuration.
Model Properties
Because repository level objects in MetaKarta have versions, properties may be specified for the model as a whole and/or for versions of that model. A good example is Version Name, which is the name of a particular version. In general, the UI makes this transparent to the user and it is not a concern. However, it is possible to use the Repository Manager to isolate particular versions independently of the configuration version you are currently working in, and thus set their properties. The result may be confusing unless you are careful.
Example
Sign in as Administrator and go to MANAGE > Repository. Then navigate to the Staging DW model.
Add a Label and click SAVE.
Expand the plus sign and click on the version of that model.
The label applies only to the model as a whole, not specific versions.
Click the Staging DW model and go to the Responsibilities tab and assign the Steward role to Stu.
Internal and External Model Properties
Models and other repository objects have internal and external properties (and relationships). The external properties may only be set from the Repository Manager. Otherwise, when editing in a worksheet or the object page of any repository level object, you are affecting the internal properties of that object.
From the previous steps, now search for that Label by clicking on the quick search text box (upper right) and pressing ENTER. Then click +FILTER and add a Labels filter.
The label "Testing" cannot be filtered for, as it is a repository (external) model property rather than an internal property: it was assigned in the Repository Manager to the model itself.
Now, go back to the Repository Manager tree, right-click on the Staging DW model and select Open to go to the object page for the model (internal).
"Testing" is not a label for this object because we are now looking at the internal properties of the model.
So, let's add it.
Testing is not even in the namespace, as we are only seeing labels defined for internal objects.
After setting the label on the internal properties for the model, when we again search, we see:
Now, we see the label "Testing" in the namespace and obtain the proper result.
Let's check the other property we added in the Repository Manager, i.e., the Steward role. Again, let's search for that role defined:
The assignment of responsibilities is different from labels: what you assign in the Repository Manager to the model externally is also inherited by the objects contained within (e.g., the internal model properties). Thus, the search works and the Steward responsibility role assignment is shown internally.
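The difference between the two kinds of property can be summarized with a small Python sketch. This is only an illustration of the behavior described in the example: the `RepositoryModel` class and its attribute and method names are hypothetical and are not part of MetaKarta or its API.

```python
from dataclasses import dataclass, field

@dataclass
class RepositoryModel:
    """Hypothetical stand-in for a model as seen in the Repository Manager."""
    external_labels: set = field(default_factory=set)    # set in the Repository Manager
    internal_labels: set = field(default_factory=set)    # set on the object page or a worksheet
    external_stewards: set = field(default_factory=set)  # assigned in the Repository Manager
    internal_stewards: set = field(default_factory=set)  # assigned on the object page

    def searchable_labels(self):
        # Labels assigned externally are not inherited, so search only sees internal ones.
        return set(self.internal_labels)

    def searchable_stewards(self):
        # Responsibilities assigned externally are inherited by the contained objects.
        return self.external_stewards | self.internal_stewards

staging_dw = RepositoryModel()
staging_dw.external_labels.add("Testing")   # label added in the Repository Manager
staging_dw.external_stewards.add("Stu")     # Steward role assigned in the Repository Manager

found_before = "Testing" in staging_dw.searchable_labels()   # False: label is external only
staging_dw.internal_labels.add("Testing")                    # label added on the object page
found_after = "Testing" in staging_dw.searchable_labels()    # True
found_steward = "Stu" in staging_dw.searchable_stewards()    # True: inherited
```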
Make a Version of a Model the Default
The default version of a model is the version that is used when an action is taken against the model as a whole. For example, if one opens a model but not a specific version, then the default version is opened.
Steps
- Sign in as a user with at least the Metadata Management capability object role assignment on the model.
- Go to MANAGE > Repository.
Be sure the Repository Panel is filtered to enable showing of versions.
- Right-click on the version of the model in the Repository Panel and select Set to default.
The previous default version will no longer be default.
Delete Unused Versions
As new versions of models and physical data models are harvested, they begin to collect. Most of these "historical" versions are of no value to keep around. Also, they consume resources, such as disk space, index size, performance of search, etc.
MetaKarta provides a tool to manage and eliminate these older versions. There is a script operation named Delete unused versions. When run, if a version of a model is not in any current configuration version, and it is more than one hour old, it will be deleted.
No model that has been imported in the last hour will be deleted by Delete unused versions as it may yet be included in a configuration in the near future as part of the harvesting and scheduled update process.
Thus, "used" refers to containment in a configuration version ONLY. No other relationship will prevent the purging of a model version which is not a member of a configuration version.
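The eligibility rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not MetaKarta's implementation: the `ModelVersion` record, its fields, and the function names are hypothetical. Only the two conditions (not in any configuration version, and more than one hour old) come from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Hypothetical record for illustration; not MetaKarta's actual data model.
@dataclass
class ModelVersion:
    name: str
    imported_at: datetime
    # Names of the configuration versions that include this model version.
    configuration_versions: set = field(default_factory=set)

# Grace period stated in the text: versions imported in the last hour are kept.
MIN_AGE = timedelta(hours=1)

def is_unused(version: ModelVersion, now: datetime) -> bool:
    """Return True when the version qualifies for deletion under the stated rule."""
    in_configuration = len(version.configuration_versions) > 0
    older_than_one_hour = (now - version.imported_at) > MIN_AGE
    return (not in_configuration) and older_than_one_hour

def split_unused(versions, now):
    """Partition versions into (kept, deleted) lists without mutating the input."""
    kept = [v for v in versions if not is_unused(v, now)]
    deleted = [v for v in versions if is_unused(v, now)]
    return kept, deleted

now = datetime(2023, 5, 1, 12, 0, tzinfo=timezone.utc)
versions = [
    ModelVersion("published", now - timedelta(days=30), {"Published"}),
    ModelVersion("stale", now - timedelta(hours=2)),
    ModelVersion("fresh", now - timedelta(minutes=10)),
]
kept, deleted = split_unused(versions, now)
print([v.name for v in deleted])  # ['stale']
```

Only the "stale" version is purged here: "published" is a member of a configuration version, and "fresh" was imported less than one hour ago.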
Steps
Manual Delete
- Go to MANAGE > Repository.
- Right-click on the Repository as a whole, a folder or a model and select More... > Operations > Delete unused versions.
Be sure to Show versions if necessary.
Scheduled Delete
In order to manage the version overload in an automated fashion, you may schedule the task.
- Go to MANAGE > Schedules.
- Provide the required information as for any scheduled task.
- Specify the Delete unused versions action.
You may use the View Links option on a version of a model to determine if it is a member of a configuration.
Example
Run an import on the Staging DW model.
After the import has finished note that there are two versions of the Staging DW.
Go to MANAGE > Repository.
Be sure the Repository Panel is filtered to enable showing of versions.
Right-click on the second version of the Staging DW model and select View Links and note that it is not a member of any configuration version and is thus a candidate to be purged.
Wait more than one hour.
Right-click on the Data Warehouse folder and select More... > Operations > Delete unused versions. Click Run Operation.
Note, the log reports one model version deleted and the version is no longer in the Repository panel.
Retain Maximum Versions
Multi-models or Catalogs of Models
The term "model" in the MetaKarta repository has multiple layers of meaning. For an imported model, a model is the minimal unit of change when importing or re-importing (versioning due to import), when including in a configuration, or when referencing in a semantic mapping (as source or target). This is the repository object that one sees in the Repository Manager tree (with its associated versions) and the Configuration Manager list.
A model may consist of only one self-contained model, such as that for a logical data model, a DDL import, or individual CSV or JSON file import. It is a very straightforward arrangement where one model (import scope) includes one organizational (internal) model.
However, in many cases, the import scope will include multiple self-contained models. Examples include an import from a data modeling tool with a logical and multiple related physical models; an RDBMS source with multiple databases and/or schemas (as self-contained models) organized by a catalog (another self-contained model); or a file system bridge which imports multiple files (self-contained models) organized in a hierarchy or directory structure (another self-contained model). In these scenarios, a given import produces many models plus a directory or catalog model that contains them, together comprising what is referred to as a multi-model.
Multi-models are very convenient, as they allow for:
- Incremental harvesting of only those databases, schemas, packages, reports or files which have changed since the last harvest, dramatically improving import performance.
- Storage of only what has changed. If only one or two of the contained models need to be created because those are the only portions which changed in the entire scope of import, then that is all that the repository needs to store.
In this way, multi-models are highly efficient. In particular, when storing a multi-model, MetaKarta attempts to minimize the number of versions of individual contained models, reusing them to the best of its ability and retaining only the subset required to "reconstitute" and present the specific model versions (imported scope) that are harvested and currently maintained.
To store a multi-model, then, the multi-model is first harvested (itself a process of only importing new versions of contained models if they differ from the cache of the last import). Each contained model is then checked for any difference with any previously harvested (and still stored) version of that same contained model already in the repository. If such a version of the contained model already exists, then the full multi-model version is associated with that existing contained model version. If the imported contained model is truly new, then a new version of that contained model is written to the repository and the full multi-model version is associated with that new contained model version.
In order to present a proper version of a multi-model, then, any search results, browsing, API results, etc., are all based upon which versions of the contained models are associated with the version of the imported model that is a part of the current configuration. It sounds a bit complex, but that complexity is well managed by the backend and thus transparent to the user experience.
For example, in the picture above, we have four different harvests of an imported model (a relational database in this illustration), and only some of the schemas prove to be updated at each harvest.
For the first harvest on 1 May 2023, we see there are three schemas (1, 2 and 3), and as this is the first harvest all three become versions of contained models.
For the second harvest on 15 May 2023, we see:
- Schema1 is unchanged and thus no new contained model version is created for it.
- Schema2 has changes from the previous version imported and thus a new version of the contained model for schema2 is imported and included in the new version of the imported model.
- Schema3 has been deleted and thus is no longer included in the new version of the imported model.
For the third harvest on 30 May 2023, we see:
- Schema1 has changes from the previous version imported and thus a new version of the contained model for schema1 is imported and included in the new version of the imported model.
- Schema2 is unchanged and thus no new contained model version is created for it.
- Schema4 has been added to the database, and a first new version of the contained model for schema4 is imported and included in the new version of the imported model.
For the fourth harvest on 30 May 2023, we see:
- Schema1 has changes from the previous version, but these changes merely restore it to the 20230501 version of that contained model. Thus, we have a case where the schema is reverted back to an earlier state and that earlier version is then included in the new version of the imported model.
- Schema2 has changes from the previous version imported and thus a new version of the contained model for schema2 is imported and included in the new version of the imported model.
- Schema4 is unchanged and thus no new contained model version is created for it.
Reuse of older versions of contained models because of reverted changes to the harvested model is common, and it is implemented efficiently by repointing to that earlier version.
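The four harvests in this example can be replayed with a short Python sketch of version reuse. It is an illustration only. The text says each contained model is "checked for any difference" with stored versions but does not say how, so comparing SHA-256 fingerprints of the content is an assumption made here. The `ContainedModelStore` class, the `harvest` function, and the placeholder schema contents are all hypothetical.

```python
import hashlib

class ContainedModelStore:
    """Hypothetical store keeping one version per distinct content of a contained model."""

    def __init__(self):
        # (schema name, content fingerprint) -> label of the stored version
        self.versions = {}

    def store(self, schema, content, harvest_label):
        # Assumption: differences are detected by comparing content fingerprints.
        fingerprint = hashlib.sha256(content.encode("utf-8")).hexdigest()
        key = (schema, fingerprint)
        if key not in self.versions:
            # Truly new content: write a new contained model version.
            self.versions[key] = f"{schema}@{harvest_label}"
        # Otherwise the existing version is reused (repointed to).
        return self.versions[key]

def harvest(store, label, schemas):
    """Return the multi-model version as a map of schema -> contained model version."""
    return {name: store.store(name, content, label) for name, content in schemas.items()}

store = ContainedModelStore()
h1 = harvest(store, "harvest1", {"schema1": "A", "schema2": "B", "schema3": "C"})
h2 = harvest(store, "harvest2", {"schema1": "A", "schema2": "B2"})
h3 = harvest(store, "harvest3", {"schema1": "A2", "schema2": "B2", "schema4": "D"})
h4 = harvest(store, "harvest4", {"schema1": "A", "schema2": "B3", "schema4": "D"})

print(h4["schema1"])        # schema1@harvest1 (reverted schema reuses the first version)
print(len(store.versions))  # 7 stored contained model versions across four harvests
```

In this sketch the four harvests reference eleven contained models in total, but only seven distinct versions are stored, because unchanged and reverted schemas point back to versions that already exist.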
Date Properties and Versions
Date Property | Imported Object (e.g. Schema, Table, Column) | Custom Object (e.g. Term) |
---|---|---|
Created Date | When the object was created in the source system, such as when the database DDL’s CREATE TABLE statement was executed. Note that: Created Date is not available on all imported tool technologies, and therefore may not be set. All columns have the same Created Date as the Table they belong to, as not all databases track column level changes. | When the object was created in MetaKarta, and not just the time the model it is contained in was created. |
Modified Date | When the object was last modified in the source system, such as when the last database DDL’s ALTER TABLE statement was executed. Note that: Modified Date is not available on all imported tool technologies, and therefore may not be set. If a table has not been modified, the Modified Date is the same as the Created Date (to avoid double query). All columns of a given table have the same Modified Date as the Table they belong to, as not all databases track column level changes. Modified Dates are populated from properties that might have slightly different names in the imported tool, such as the Snowflake table last altered time. | Not Applicable |
Imported Date | The last time the object was imported (possibly after incremental harvesting), which might be newer than the Created or Modified Date. This date is critical to assess the freshness / accuracy of the Created and Modified Dates when the import is manual or on a slow-paced weekly / monthly schedule. | Not Applicable |
Updated Date | The last time any user changed the object. This includes any attribute change or relationship change. Last date recorded in the audit log. | The last time any user changed the object. This includes any attribute change or relationship change. Last date recorded in the audit log. |
Viewed Date | The last time the current user visited any tab on the object page for this object. | The last time the current user visited any tab on the object page for this object. |

> Viewed Date is a slightly special field, as it is updated each time an object is viewed via its object page. Thus, placing it on the object page has little value, as the result would always be "now".
However, it does make an effective result column as part of a worksheet, as below:
Propagate Documentation (Option Deprecated)
MetaKarta enables the propagation of documentation (including Name and Business Definition, diagrams, join relationships and custom attributes) from older versions of a model. This propagation of the documentation is now always enabled.
When you check the Propagate documentation checkbox, MetaKarta does not change behavior, and documentation changes made to any historical version are still propagated to new versions. If you leave this checkbox unchecked, the same behavior applies.
Basically, the behavior is that a change to a version applies to any later versions. However, there is an exception when dealing with relationships from or to the imported model and its contained objects.
When the latest version of the model already contains the proper change, going to an older version that does not have the change and updating it can have unintended side effects. For relationships this is a bit tricky, as one must deal with cardinality as well. Consider the case where the link is unary and the latest version has the unary link to a particular object: trying to create the unary link in the older version to a different object would mean deleting the existing link in the latest version. In this case:
- When the newer versions already have the link, we create the link in the old version but do not propagate it to newer versions.
- Otherwise, we propagate it to newer versions when it is unavailable there.
This feature can produce conflicts and unexpected results. For example, when you edit the same piece of documentation in different versions the latest edit wins. It is true even when you edit the latest version first and an old version later. You should disable the feature after you finish making future-proof changes to older versions to avoid unnecessary conflicts.
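The "latest edit wins" conflict can be illustrated with a small Python sketch. It models only the basic rule stated above (a change to a version applies to that version and any later versions) and deliberately ignores the relationship exception. The list-of-dictionaries representation and the `edit_documentation` function are hypothetical, not MetaKarta's implementation.

```python
def edit_documentation(versions, version_index, field_name, value):
    """Apply an edit to one version and propagate it to every later version.

    `versions` is a list of documentation dictionaries ordered oldest first.
    """
    for later in versions[version_index:]:
        later[field_name] = value

# Three versions of the same model, oldest first.
versions = [{"Business Definition": ""} for _ in range(3)]

# Edit the latest version first...
edit_documentation(versions, 2, "Business Definition", "edited in latest")
# ...then edit the oldest version; the change propagates forward.
edit_documentation(versions, 0, "Business Definition", "edited in oldest")

print(versions[2]["Business Definition"])  # edited in oldest
```

The later edit, made on the oldest version, overwrites the earlier edit made directly on the latest version, which is the kind of unexpected result the text warns about.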