Table of Contents
- Overview
- Copyright Notice
- Release Changes
- System requirements
- Database Server Setup
- Application Server Setup
- Application Server Installation and Configuration
- Application Server Upgrade
- Application Server Execution and Initialization
- Custom integration with Authentication Environments
- Custom integration for Secure Socket Layer (SSL) communication
- Security and Vulnerability Considerations
- Lucene Search Engine Troubleshooting
- High Availability Considerations
- Metadata Harvesting Model Bridge (MIMB) Setup
- User Interface Look & Feel Customization
- REST API SDK
- Database Server Backup/Restore
1. Overview
Meta Integration® Metadata Management (MIMM) is based on the Meta Integration® Repository (MIR) for metadata storage (in a database server), and the Meta Integration® Model Bridge (MIMB) middleware for metadata harvesting.
2. Copyright Notice
Copyright © Meta Integration Technology, Inc. 1997-2023.
All Rights Reserved.
Meta Integration® is a registered trademark of Meta Integration Technology, Inc.
Other product and company names (or logos) mentioned herein may be the trademarks of their respective owners.
http://www.metaintegration.com
3. Release Changes
v11.2.0 (v11 2023 Winter update scheduled for 10/31/2023)
- NEW FEATURE OVERVIEW
This update is focused on supporting MM and MIMB product installers for the major public clouds (AWS, Azure, and GCP). These installers deploy the products as Docker containers in Kubernetes clusters. Traditional Linux and Windows ZIP installers are still available.
In addition, the business user search experience has been improved.
- NEW NATIVE CLOUD DEPLOYMENTS
- NEW Kubernetes based MIMB Software as a Service (SaaS on AWS, Azure, and GCP)
- NEW Kubernetes based MIMM Software as a Service (SaaS on AWS, Azure, and GCP)
- NEW Solr Cloud SaaS index replaces local Lucene
- IMPROVED SEARCH USER EXPERIENCE
Details to be announced.
v11.1.0 (v11 2023 Summer update scheduled for Beta 06/30/2023 and GA 08/31/2023)
- NEW FEATURE OVERVIEW
This update is focused on new business user features, such as user collaboration, data documentation, business information diagramming, sharing, ownership, and search. They involve new Articles and Diagrams, as well as improvements to existing capabilities (Collections, Worksheets, Dashboards, Presentations). These new features are critical for data shopping, data trust, and data health applications.
In addition, new usage analytics capabilities have been added to examine user growth, user search popularity, object inventory growth, glossary growth, documentation coverage, data classification growth, data lineage coverage, user collaboration growth, etc.
- IMPROVED DATA DOCUMENTATION
The vocabulary used in the data documentation process had several overlapping uses of "definition", "description", etc., that behaved differently in the system and made it difficult to search across objects or to understand the results in worksheets with different types of objects. This vocabulary has been simplified and harmonized with the following attributes:
- "Name" is unchanged as the physical name (e.g. CUST) of an imported object (e.g. Table) or the actual name of a custom object (e.g. Term).
- "Business Name" is unchanged as the data documented logical name (e.g. Customer) of an imported object (e.g. Table).
- "Definition" replaces and merges "Description" and "Business Description" as the short un-formatted text that defines any object (i.e. can be used as tooltip).
- "Source System Definition" on imported objects replaces any use of "Definition", "Description", or "Comment" in the source system metamodel (profile) imported from data modeling, data integration, and business intelligence tools.
- "Description" replaces "Long Description" as HTML formatted text of unlimited length that can include images, tables, etc.
Consequently, the default data documentation attributes are defined as follows:
- Any imported object has a Name, a Business Name and a Definition available by default and may also have a Source System Definition, but will not have a Description (Administrators may add it in Manage Metamodel).
- Any custom object only has a Name by default, but does not have a Business Name, Definition or Description (Administrators may add them in Manage Metamodel).
Note that imported models from logical/physical data modeling tools (e.g. Erwin) have imported objects (e.g. table) that may have both:
- a logical definition of that table called "Definition" or "Description", which is now called "Source System Definition",
- a physical definition of that table called "Comment" which comes from the SQL COMMENT concept.
- NEW DOCUMENTATION EDITOR
The user experience of editing descriptions, comments, or the new articles and issues (see below) has been dramatically enhanced with a brand new bundled WYSIWYG (What You See Is What You Get) HTML editor. This editor is available for any custom attribute of HTML data type. It brings the equivalent of Google Docs or Microsoft Word within this web application, including all the usual text formatting capabilities, image management, and even copy/paste with formatting from Word or HTML pages.
- NEW DOCUMENTATION SUPPORT FOR OBJECT AND USER MENTIONS
In addition, the above newly bundled WYSIWYG HTML editor (of descriptions, comments, and articles) has been enhanced to support mentions of users (e.g. @John) and objects (e.g. @Customer). Users creating new object or user mentions benefit from automatic assistance, from auto-complete to more sophisticated search, to find the right user or object. Existing mentions are automatically maintained within the documentation upon any renaming of the mentioned object or user.
- NEW ARTICLE OBJECT TYPE
Descriptions can be associated with any harvested object (e.g. imported table) or custom object (e.g. a glossary term). They now benefit from the above new bundled WYSIWYG HTML editor with object and user mentions, but are not intended to be full length documents. Articles are designed for business users to develop and collaborate on any kind of document such as review reports, change requests, white papers, user guides, overviews, etc. Articles are implemented by a new predefined object "Article" with a predefined attribute "content" of HTML data type.
A new pre-installed "Standard Extension Articles" package allows users to create new models of type "Articles" which contain the Article object type (just like Glossary contains Terms). Manage Metamodel allows one to extend the Article object type with custom attributes or links to other custom objects. Articles benefit from the same capabilities as any other custom objects, including search (MQL), security, as well as the ability to have comments and mentions, and may even operate under workflow.
- NEW ISSUE OBJECT TYPE
An Issue has an HTML based rich text formatted description that can contain images, tables, and even mentions of users and objects. An Issue also has the classic attributes (e.g. Status, Priority, Assignee, Reporter) and the relationships (e.g. Blocks, Related To, Duplicates) commonly used by issue tracking systems such as Atlassian JIRA. A new pre-installed "Standard Extension Issues" package allows users to create new models of type "Issues" which contain the Issue object type (just like Glossary contains Terms). Manage Metamodel allows one to extend the Issue object type with custom attributes or links to other custom objects. Issues benefit from the same capabilities as any other custom objects, including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- NEW (MANAGE METAMODEL) STANDARD OBJECT TYPES
The standard package offers additional predefined object types in order to model the existing Data Mappings, Semantic Mappings, and the new generation Data Models as object types, including:
- New relationships as object types (also known as n-ary relationships in ER modeling or relationship as class in UML) which can have attributes, including:
- "Binary Relationship" object type connecting only two objects at the instance level (with subtypes such as the new "Semantic Link" object type)
- "N-ary Relationship" object type connecting more than two objects at the instance level (with subtypes such as the new "Classifier Map" object type)
- New root abstract object types (required as source/target of relationships that can apply to any repository object), including:
- "Any Object" abstract object type represents any standard, custom or imported object type (as used in the Defines/Is Defined By relationship on the new Semantic Link object type).
- "Any Imported Object" abstract object type is a (virtual) subtype of "Any Object" representing only Imported Objects created by import bridges.
- New base object types (required for data mappings), including:
- "Any Classifier" object type represents any database table, file system file, etc. (as used in the source/target relationships on the new Classifier Map and Feature Map object types).
- "Any Feature" object type represents any table column, file field, etc. (as used in the source/target relationship on the new Feature Map object type).
- NEW DATA MAPPING OBJECT TYPES
Data Mappings are now modeled as objects, as instances of the new "Data Mapping" model type (in Manage Metamodel), which includes new object types: Data Mapping Folder, Classifier Map (with subtypes: Bulk Mapping and Query Mapping), and Feature Map.
These new data mapping objects benefit from the same capabilities as any other custom objects, including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- NEW SEMANTIC MAPPING OBJECT TYPES
Semantic Mappings are now modeled as objects, as instances of the new "Semantic Model" model type (in Manage Metamodel), which includes a new Semantic Link object type.
These semantic link objects benefit from the same capabilities as any other custom objects, including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- NEW DATA MODELING CAPABILITIES AND OBJECT TYPES
Data modeling can be performed externally with data modeling tools (e.g. Erwin) whose models can be imported in MM, and then stitched to a matching imported database. Alternatively, relational databases could be imported as a Physical Data Model (PDM, instead of a regular imported Model) where local documentation and diagrams could be defined. This PDM capability has been deprecated, as it was replaced (a few years ago) by the introduction of the Relationship and Diagram tabs on any imported database, enabling users to automatically detect, define and document relationships, and design ER diagrams. However, these data modeling capabilities were still limited to relational databases; this new release fully redesigns the data modeling capabilities with many new features:
- GENERALIZED DATA MODELING
- Data modeling is no longer limited to relational (RDBMS) databases, but now also supports hierarchical (NoSQL) databases, and object stores (e.g. JSON in Amazon S3).
- Data modeling is no longer limited to a given RDBMS schema (as with data modeling tools like Erwin for PK/FK relationships), but now also supports relationships and diagrams between Classifiers (tables or files) located anywhere:
- in any catalog or schema of a given database server (multi-model of an imported model),
- in any database models (e.g. the Customer id of a table in the DW database in Snowflake and the Sales database in SQL Server),
- in any technologies (e.g. the PO number of a table in the DW database in Snowflake and the field of a JSON file in Amazon S3).
- Data Modeling is no longer limited to entity relationships of (any) data stores, but now also supports any standard or custom relationships (defined in Manage Metamodel), which now even include Classifier Map, Feature Map, Semantic Link and more. This opens the door to multi-purpose business diagrams (as explained below) involving different types of relationships to illustrate a use case.
- DATA MODELS AS OBJECTS
As with Data Mappings and Semantic Mappings, Data Models are now modeled as objects, as instances of the new "Data Model" model type (in Manage Metamodel), which includes new object types: Data Model Folder, Entity Relationship containing Column Mapping(s), and ER Diagram containing ER Diagram Object(s).
These new data model objects benefit from the same capabilities as any other custom objects, including search (MQL), security, as well as the ability to have comments, and even operate under workflow.
- NEW ER DIAGRAMS
- as Technical Data Model Diagrams:
This represents the primary use case of ER Diagrams, fully replacing the use of any external data modeling tool for data documentation, and far more powerful as it spans multiple data stores and technologies (RDBMS, NoSQL, object stores).
- as Business Use Case Diagrams:
These new diagrams can be more business oriented than a pure technical ER Diagram by allowing graphical decorations and any additional object and relationship types (besides joins or PK/FK), such as a Classifier Map, Feature Map, Semantic Link or any custom relationship to illustrate a use case.
- as Object Navigator/Explorer Diagrams:
Starting from a given object, users can now graphically expand/navigate any relationships with various automatic layouts (e.g. flow).
- Not a substitute for Data Flow and Semantic Flow Diagrams:
Although the new ER Diagrams are multi-purpose for any relationships between entities/objects of any model (as explained above), they are not a substitute/replacement for the existing critical interactive analysis diagrams, which are:
- Data Flow Diagrams for data lineage and impact analysis,
- Semantic Flow Diagrams for semantic definition analysis.
- NEW ENTITY RELATIONSHIPS
- supporting any relationship types (besides joins or PK/FK),
- enabling worksheet / bulk editing of relationships, as well as CSV import/export.
- NEW ENTITIES
(This feature may be released post GA as a cumulative patch.)
- Allowing the creation of new entities for conceptual / logical data modeling, for Enterprise Data Models or new data store requirements.
- NEW DATA FLOW LINEAGE ANALYSIS DIAGRAMS
Using fewer objects to render much bigger data flow lineage traces, and allowing:
- decorating objects with tags (such as a sensitivity label or PII), and
- comparing the lineage with a previous version of that data flow.
- NEW REFERENCE DATA MODELS
(This feature may be released post GA as a cumulative patch.)
- Code set mappings, and more.
- NEW BUSINESS PROCESS MODELS
(This feature may be released post GA as a cumulative patch.)
- Business Process Model and Notation (BPMN) compliant diagrams (see https://www.bpmn.org).
- IMPROVED DATA SAMPLING AND DATA PROFILING
- New data request methods: fast "Top" (now the default) vs. "Random" (reservoir sampling when available on the database; a sketch of the idea follows this list) vs. "Custom Query" (on selected tables)
- New data request scope: subset of tables defined by a provided MQL (e.g. tables from a set of schemas, or table with/without a user defined data sampling flag)
- New data overwrite protection (on selected tables) to prevent an automatic data import (e.g. when a previous long random sampling had been performed)
- New data import operation independent of the metadata import operation: the option to automatically perform data import after metadata import remains enabled by default, but an explicit data import can now be requested by API or scheduled (Manage Schedules).
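For reference, the "Random" method above relies on reservoir sampling, a one-pass technique for drawing a fixed-size uniform sample from a table of unknown size. The following Python sketch (classic Algorithm R) only illustrates the general idea; it is not MM's actual implementation, which runs on the database when available.

import random

def reservoir_sample(rows, k):
    # Classic Algorithm R: the first k rows fill the reservoir, then each
    # subsequent row i (0-based) replaces a random slot with probability k/(i+1).
    reservoir = []
    for i, row in enumerate(rows):
        if i < k:
            reservoir.append(row)
        else:
            j = random.randint(0, i)  # uniform in [0, i]
            if j < k:
                reservoir[j] = row
    return reservoir

# Example: keep a uniform sample of 5 rows from a stream of 1000 rows.
print(reservoir_sample(range(1000), 5))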
- IMPROVED HIGH LEVEL SHAREABLE USER OBJECTS (Collections, Worksheets, Dashboards)
High level user defined objects (e.g. Collections, Worksheets, Dashboards, or Presentations) now have more powerful sharing capabilities with the notions of Owners, Viewers, and Editors, available through a user-friendly UI similar to popular cloud object stores like Google Drive.
Collections, Worksheets, Dashboards, Users and Groups are now available in the global search and MQL.
- NEW USAGE ANALYTICS
A new repository operation "Export analytics" (which can be scheduled on a daily basis) generates usage analytics from the repository database, API, audit log and Lucene index into CSV files (by default in $MM_HOME/data/files/mm/analytics). Such files can be analyzed by the customer BI tool of choice (such as Microsoft Power BI or Tableau); an example is provided in $MM_HOME/conf/Template/analytics/demo/demo.pbix. A sketch of post-processing these CSV files follows this list. Possible usage analytics currently include:
- Control over the usage analytics scope (selected configuration, or entire repository) and the interval (Days, Months, Years)
- User growth and logins per day
- User search (count, popularity)
- Object Inventory (model count, model types, object count, object types, object growth)
- Glossary (term count and growth)
- Documentation (objects with documentation count and growth, top documented models)
- Data Classification (objects with data classes count and growth, top data classes, data class count and growth)
- Data Lineage (object with lineage count and growth, model connection count and growth)
- User Collaboration (count and growth of endorsements, certifications, warnings, comments, and attachments)
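As an illustration of consuming these exports outside of a BI tool, the following Python sketch loads one of the generated CSV files with pandas. The directory comes from the release notes above, but the file name ("user_logins.csv") and its columns ("date", "login_count") are hypothetical placeholders; check the actual export for the real names.

import os
import pandas as pd

analytics_dir = os.path.expandvars("$MM_HOME/data/files/mm/analytics")
# Hypothetical file and column names -- adjust to the actual export.
logins = pd.read_csv(os.path.join(analytics_dir, "user_logins.csv"),
                     parse_dates=["date"])

# Aggregate the daily login counts into a monthly user growth view.
monthly = logins.set_index("date")["login_count"].resample("MS").sum()
print(monthly.tail(12))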
- NEW SUPPORT FOR MULTI CATALOG DATABASE IMPORT
A critical aspect of importing metadata from large servers is the support for multi-model incremental harvesting, where the import bridge can detect changes and efficiently harvest only the subset that has been updated. In the case of a large BI server, only the models of the changed reports are imported. In the case of a large DB server, only the models of the changed schemas are imported. Not only is this multi-model incremental harvesting much faster, but it also minimizes the space needed in the repository (with a version and configuration management license) by reusing the models which did not change. Previously, most database import bridges required the selection of a single database catalog, with the exception of SQL Server, which allowed the import of multiple catalogs at once (in which case all schemas of a given catalog were stored as a single model).
With this improvement, the database import bridges for popular large cloud servers like Snowflake, Google BigQuery, SAP HANA, Presto, and Microsoft SQL Server (including on Azure) now provide native multi catalog support, with multiple schemas represented as multi-models. This improvement reduces the number of model imports to configure, reduces the amount of repository storage needed (with a version and configuration management license), and accelerates the incremental harvesting. In addition, it significantly facilitates the automatic metadata stitching (connection resolution) at the entire database server level, automatically resolving changes in the underlying catalogs and their respective schemas. Finally, it improves data governance by allowing responsibilities to be added (Add Roles) at any level, from the entire server model down to any catalog or schema.
- IMPROVED USER EXPERIENCE
The UI has been significantly improved for better clarity and user experience in many areas, in particular:
- Selecting users (or groups) for different use cases (such as filtering per user, or sharing a dashboard with users) has been harmonized and improved for usability (search) and scalability to a very large number of users.
- Managing Users now offers a paginated UI with filters, allowing much improved scalability.
- User Activity log's UI layout has been redesigned with a new look and feel.
- Object History / change log's UI layout has been redesigned with a new look and feel.
- IMPROVED SEARCH
Search is now implemented by a dedicated Solr server rather than local Lucene index files managed by the MM application server. As a side effect, the overall performance of the MM server and search has been significantly improved. For example, a worksheet might have a million rows that will be indexed and sorted.
- THIRD PARTY SOFTWARE UPDATES
All third party & open source software has been upgraded to the latest versions for bug fixes, improvements, and better security vulnerability protection. For more details, see the published MIMM Third Party & Open Source Software Usage and LICENSES.
- SECURITY VULNERABILITY UPDATES
Numerous major improvements to resolve any new security vulnerabilities.
- PRE UPGRADE REQUIREMENTS
- Same steps as in any previous release.
- Physical Data Models (PDM) were deprecated (and replaced by local data modeling) in 10.0 (2018) but remained available in 11.0 (2022). PDM is now officially EOL and no longer available in 11.1; therefore, make sure that any legacy PDM models were migrated as regular (imported) Models prior to this upgrade.
- POST UPGRADE ACTIONS
- Same steps as in any previous release.
- IMPROVED DATA DOCUMENTATION
TBD on external REST API based applications using MQL involving Description or Long Description.
- NEW SUPPORT FOR MULTI CATALOG DATABASE IMPORT
TBD on the full re-import of multi-catalog databases (e.g. SQL Server or Snowflake), surrounding ETL/DI tools (e.g. Informatica PowerCenter or Talend), and BI tools (e.g. Microsoft Power BI, Tableau), before taking advantage of the new multi-catalog connection resolutions (i.e. stitching and configuration build).
- IMPROVED SEARCH
TBD on the bundled Solr app running alongside the MM app in the same bundled Tomcat server.
TBD on the connection to an external existing Solr server.
TBD on the automatic migration from MM app managed local Lucene files to the Solr app server.
v11.0.0 (01/31/2022 with above continuous deployment new feature updates)
- NEW FEATURE OVERVIEW
This new major release brings key Data Governance (DG) solutions on top of the existing powerful Data Catalog (DC) and Metadata Management (MM) foundations of previous versions. MM already offers all Technical Models (including their metamodels and associated MIMB bridges/connectors) for virtually any data store (file systems / object stores / data lakes, RDBMS, NoSQL, DW), Data Integration (DI) and Business Intelligence (BI) tools and technologies, and the list of MIMB supported tools keeps growing thanks to the largest ecosystem of partners. The key feature of this new version is the ability to define and populate Business Models for data management such as reference data, data quality, data trust, data security, data sharing and shopping, data issue management, business rules, business process modeling and improvements, vertical market specific business applications and regulation compliance. MM is pre-populated with standard business models, starting with the Business Glossary, which can now be fully extended with custom business objects and associations. The Data Catalog capabilities have been significantly enhanced with automatic data classification (machine learning), now supporting both data classes (previously semantic types) and metadata classes (metadata query language driven), already pre-populated to detect and hide the most popular Personally Identifiable Information (PII).
- NEW METAMODEL MANAGEMENT FOR CUSTOM "BUSINESS" MODELS
Custom "Business" Models can now be defined with customizable metamodels as needed in many data governance related domains such as data management, reference data, data quality, data trust, data security, data sharing and shopping, data issue management, business rules, business process modeling and improvements, vertical market specific business applications and regulation compliance.
- Administrators can use a new Manage Metamodel menu to define their custom "business" models with the full power of object modeling, all the way to the graphical editing of UML class diagrams for each business model. See help.
- The modeling starts by defining reusable attributes promoting data standardization among business objects. Such attributes can be of any basic type such as integer, string, date, or enumeration, but also more active types like email, web URL or phone number, offering a better user experience (send an email, make a phone call, etc.). See help.
- Custom "Business" objects are then created based on these reusable attributes, and custom associations can be created, including regular reference relationships, but also composition links (UML aggregations), and UML generalizations allowing the definition of abstract business objects. See help.
Custom "Business" objects have a name and icon that can be searched from an expansive bundled library of icons, customized (e.g. change color), uploaded (from external sources), or even designed in the UI (starting from a shape, color, etc.).
- Finally, custom "business" objects are associated with custom "business" models, ready to be populated. MM is pre-populated with a few standard (system read only) business models (starting with the business glossary model) and a few model extensions. Associations can therefore refer to business objects across different business models. See help.
- Users can then use the UI for data entry, analysis and reporting on such custom "business" models with the same capabilities as with harvested / imported "technical" models, including their use in the Metadata Query Language (MQL), Worksheets, and therefore Dashboards. In addition, Business Models also offer a new Hierarchy tab allowing one to drill down hierarchically in both data entry (including bulk editing) and reporting. Workflow can also be applied to the business model, where objects for business rules or business policies can go through an elaborate workflow from proposed, draft, and approved, all the way to published (and even deprecated) to the end users. See help.
- Integrators have external bulk editing/reporting available through CSV import/export capabilities, as well as the REST API, allowing the definition of actual connectors (bulk or real time sync) with the actual tools / applications behind the business models, such as JIRA for the Data Issue Management model, or the customer's custom DQ applications for their Business Rule Model. See help.
- NEW METAMODEL MANAGEMENT FOR IMPORTED "TECHNICAL" MODEL EXTENSIONS
Imported "Technical" Models are based upon predefined metamodels associated with MIMB bridges/connectors for virtually any data store (file systems / object stores / data lakes, RDBMS, NoSQL, DW), Data Integration (DI) and Business Intelligence (BI) tools and technologies. Such predefined technical metamodels can now be extended for data documentation purposes with the same Manage Metamodel (admin) UI used to define custom "business" models.
Therefore, the new Manage Metamodel (admin) UI not only allows the creation of new custom "business" objects (for the new custom "business" models), but also the creation of new imported "technical" objects defined (scoped) as a set of technical objects predefined in the import bridge metamodels. For example, a new generic "data field" imported object can be defined as either an RDBMS table column, a NoSQL JSON field, a CSV field, etc.
Consequently:
- Custom Attributes can now be defined and applied the same way (and are therefore reusable) for both imported "technical" objects and custom "business" objects. Not only does this eliminate the previous Manage Custom Attributes (admin) UI, it more importantly avoids redefining the scope of each custom attribute applying to similar imported objects (as was frequently the case for table/file/entity or for column/field/attribute).
- Custom relationships can now be defined from a custom "business" object to imported "technical" objects. For example, a new custom model called "business policies" can contain a new custom object called "business rule" which can have an "enforce" custom relationship to an imported object called "data field" as defined above.
- Custom relationships can optionally be set to be involved in the semantic flow. In the above use case, this allows the semantic flow tab to include not only the term definition of a table column, but also the business rules, all the way to the business policies.
- This allowed the implementation of the term classification process (now called term documentation) with an actual custom relationship "Defines" from "Term" to a new predefined "Imported Object" in the predefined standard metamodel. Although this new term documentation implementation as a relationship has no impact or direct benefits on the user experience, it offers solutions to the continuous changes in technologies and architectures. For example, the data documentation (including term documentation) of a well documented data warehouse on premises (e.g. Teradata) can be exported and reimported into a new implementation of that same data warehouse on the cloud (e.g. Snowflake).
- IMPROVED DATA DOCUMENTATION AUTOMATION AND PRODUCTIVITY
The data documentation process of imported "technical" models is a critical part of any data catalog. Any imported object (e.g. tables/files, columns/fields) comes with a physical name which needs documentation with a user friendly (logical) name and description, which are now better presented and managed in the following categories: See help.
- Business Documentation offers local documentation with a business name and business description. This can be used as an alternative to the term documentation below, or a means to supersede an existing term documentation with a better local definition.
- Term Documentation (previously called term classification) allows documenting any imported object with one or more terms from a glossary (now creating an "Is Defined By" relationship).
- Mapped Documentation allows documenting any imported object connected by a semantic mapping with one or more terms from a glossary, or entities/attributes from a data model.
- Inferred Documentation provides data documentation on any imported object, automatically generated from other objects involved in its data flow pass-through lineage and impact. This is a powerful feature dramatically increasing the automatic data documentation coverage on many data stores (ODS, data lake, DW) of the Enterprise Architecture.
New "Term Documentation" and "Inferred Documentation" attributes are available in the REST API, MQL, and therefore worksheets and dashboards, allowing the creation of KPI graphical widgets on data documentation coverage.
- NEW DATA CLASSIFICATION
Data classification is a critical part of data cataloging automation and therefore received major enhancements from the previous concept of Semantic Types, now renamed Data Classes (a sketch of the matching idea follows this list):
- New Data Classes of type "Metadata", a metadata driven classification process powered by the Metadata Query Language (MQL) allowing one to detect classes by metadata name (e.g. field / column / attribute name), which is critical to detect many PII that cannot be detected by data sampling/profiling, such as maiden name, date of birth or place of birth. See help.
- Improved Data Classes of type "Data", the classic data sampling driven data classification process based on: See help.
- Enumerations such as a list of codes / values,
- Patterns such as a US SSN with 999-99-9999,
- or regular expressions such as a US ZIP with ^[0-9]{5}(-[0-9]{4})?$,
- new control over the matching threshold and uniqueness threshold,
- new machine learning based automatic discovery of data class patterns or enumerations (e.g. automatically learning new code values),
- new server side re-classification on demand (e.g. after adding new data classes), no longer requiring a new data sampling / profiling to take advantage of the new data classes.
- Improved Data Classes of type "Compound" (e.g. PII) based upon multiple data classes of the data detection type (e.g. SSN) or metadata detection type (e.g. Date of Birth), allowing one to hide PII within any data sampling / profiling without any machine learning or customization to start with. See help.
- MM is now pre-populated with PII data classes of type data (as previously for SSN), but also new PII data classes of type metadata (e.g. Date of Birth), and new PII data classes of type compound combining all types of PII data classes. See help.
- Redesigned data classification architecture, now processed on the MM server side, allowing for on demand / refreshed automatic data cataloging (e.g. after new data classes are created/updated).
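To make the data driven matching concrete, the following Python sketch applies the US ZIP regular expression quoted above, together with a matching threshold and a uniqueness threshold. The threshold values and sample data are illustrative only; MM's actual classification engine and its defaults are configured in the product.

import re

ZIP_RE = re.compile(r"^[0-9]{5}(-[0-9]{4})?$")  # the US ZIP pattern quoted above

def classify(values, pattern, match_threshold=0.8, uniqueness_threshold=0.5):
    # A column is classified when enough sampled values match the pattern
    # and the values are sufficiently distinct (illustrative thresholds).
    non_null = [v for v in values if v]
    if not non_null:
        return False
    match_ratio = sum(1 for v in non_null if pattern.match(v)) / len(non_null)
    uniqueness = len(set(non_null)) / len(non_null)
    return match_ratio >= match_threshold and uniqueness >= uniqueness_threshold

sample = ["94301", "10011-4211", "n/a", "60614", "94301"]
print(classify(sample, ZIP_RE))  # True: 4/5 values match and 4/5 are distinct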
- NEW OBJECT SENSITIVITY LABELS
- A new Manage Sensitivity Labels (admin) UI has been created to define sensitivity labels as an ordered flat list such as: Unclassified > Confidential > Secret > Top Secret. Each sensitivity label has a description, a hide data property (only used when applied to a column/field), and a color (for example Confidential can be orange and Top Secret red). By default, no sensitivity labels are predefined, which means this feature is disabled by default. See help.
- Sensitivity labels can be manually applied by authorized users (with a role that includes the Data Classification capability) to any individual object, from an entire model, a report, a schema, or a table, all the way down to a column. Note that there is no inheritance: setting a schema as Secret does not make each of its tables and respective columns Secret. Sensitivity labels can also be set in bulk (e.g. on multiple columns at the same time). See help.
- Sensitivity labels can automatically be set as "Sensitivity Label Data Proposed" through the automatic data classification detection. For example, a data class SSN can be associated with a sensitivity label called Confidential or GDPR. In that case, any table columns or file fields detected as SSN will also automatically be set with that Confidential or GDPR sensitivity label. Note that in such a case the approval process of data classes also applies to sensitivity labels. In addition, approving a data class detection on a given object also approves its associated sensitivity label. See help.
- Sensitivity labels can be automatically inferred as "Sensitivity Label Lineage Proposed" following the data flow lineage (similar to the Inferred Documentation concept), but instead going through any data flow (with transformations or not). This powerful new solution allows automatic sensitivity label tagging across the enterprise architecture, and has been implemented and optimized through a server cache detecting any data flow changes in the configuration. As with "Sensitivity Label Data Proposed", the "Sensitivity Label Lineage Proposed" can be rejected, thereby stopping the propagation of the inferred sensitivity label in that data flow direction. Note that the propagation of the inferred sensitivity label is also stopped by any data masking discovered within the ETL/DI/Script imports involved in that data flow (a conceptual sketch of this propagation follows this list). See help.
- Sensitivity labels are highly visible in the UI (at the top of any object overview), and can be queried through MQL (in the UI or REST API). Applications can be built to query these sensitivity labels in order to automatically generate / enforce data security on the data stores (e.g. databases or file systems with Ranger). Note that sensitivity labels do not directly set or bypass the role based security of the MM repository, or automatically hide data from the MM repository (these actions can be set separately in MM). See help.
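Conceptually, the "Sensitivity Label Lineage Proposed" behavior is a downstream propagation over the data flow graph that stops at rejected objects and masked flows. The Python sketch below only illustrates that idea; the object names are hypothetical, and MM's actual server side implementation (with its change detecting cache) is far more involved.

from collections import deque

def propose_labels(flow_edges, sources, rejected, masked_edges):
    # flow_edges: object -> list of downstream objects it feeds.
    # sources: objects already carrying the label (e.g. approved SSN columns).
    # rejected: objects where a user rejected the proposal.
    # masked_edges: (from, to) flows where the import detected data masking.
    proposed, queue = set(), deque(sources)
    while queue:
        node = queue.popleft()
        for target in flow_edges.get(node, []):
            if (node, target) in masked_edges or target in rejected:
                continue  # masking or an explicit rejection stops propagation
            if target not in proposed and target not in sources:
                proposed.add(target)
                queue.append(target)
    return proposed

flows = {"stg.ssn": ["ods.ssn"], "ods.ssn": ["dw.ssn", "mart.ssn_masked"]}
print(propose_labels(flows, {"stg.ssn"}, rejected=set(),
                     masked_edges={("ods.ssn", "mart.ssn_masked")}))
# {'ods.ssn', 'dw.ssn'} -- the masked branch receives no proposed label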
- NEW OBJECT CONDITIONAL LABELS
- A new Manage Conditional Labels (admin) UI has been created to define conditional labels based on the Metadata Query Language (MQL), such as "Highly Commented" based on objects with over 10 comments. See help.
- Each conditional label has a name and icon that can be searched from an expansive bundled library of icons, customized (e.g. change color), uploaded (from external sources), or even designed in the UI (starting from a shape, color, etc.). See help.
- Conditional labels are visible in the overview page of any object. See help.
- Conditional labels can be displayed in search results and worksheets. See help.
- Conditional labels can be displayed in data flow lineage diagrams.
- NEW OBJECT WATCHER AND EMAIL NOTIFICATION
- The Manage Emails (admin) UI (used to set up notification emails) has been extended to enable the Watcher capabilities at the server level (with an adjustable frequency: the server checks every 15 minutes by default, but this can be set to hourly or more to avoid overloading the server). See help.
- The Manage Object Roles (admin) UI has been extended with a Watcher Editor capability (allowing a user to start/stop watching an object), and a Watcher Manager capability (allowing one to add/remove anyone as a watcher of an object). See help.
- The Manage Users (admin) UI has been extended to set the watcher notification frequency (of a given user) from daily (default), to never (i.e. turned off), to near real time (as set up at the server level). Note that the same watcher frequency can also be set by each individual user in their top right menu for user profile / preferences. See help.
- With all of the above Watcher capabilities configured, users can now see a watcher icon ("eye") at the top right of the object overview page (next to the sensitivity label, endorsement icons, and menus). The watcher icon shows the count of watchers on that object and offers menus for the user to start/stop watching that object, and possibly add/remove other user watchers if authorized (with the Watcher Manager capability). See help. A new "Watchers" attribute is available that can be used in search, MQL, or for UI customization. See help.
- The Watcher capabilities are supported on both imported "technical" models and custom "business" models. However, the watcher capabilities are available at the model level only (e.g. not down to just a column or a term object). In the case of imported "technical" models harvested as multi-models, one can watch the entire multi-model (e.g. an entire database server), or individually watch any desired sub-model such as a given schema of PostgreSQL or a given Workbook of Tableau. See help.
- Watchers of imported "technical" models receive a separate email per model and per type of activity as follows: See help.
- Any metadata harvesting driven changes at any level (e.g. add/delete/update of any schema/table/column/type) as soon as (in real time) an import (incremental harvesting) succeeds with changes, or fails. In such a case, the watcher notification email includes change summary statistics (e.g. number of added, deleted, and updated objects), and an MM server URL link to its model version comparison report for full details.
- Any other changes such as data documentation (e.g. business name, description, or term classification), social curation, etc. at any lower level (e.g. table, column, data type), as often as defined by the server or a user. In such a case, the watcher notification email includes change summary statistics (e.g. number of changed objects), the top 5 changed objects (with an MM server URL link to the overview page of each object), and finally the detailed changes (with an MM server URL link to the search UI filtered by the content of that model and ordered by last modified).
- Watchers of custom "business" models, Data Mappings, Semantic Mappings, and Physical Data Models receive a separate email per model on any change at any level. See help.
- Any changes at any level (e.g. add/delete objects, update attributes, add/delete relationships, etc.) as often as defined by the server or a user. In such a case, the watcher notification email includes change summary statistics (e.g. number of changed objects), the top 5 changed objects (with an MM server URL link to the overview page of each object), and finally the detailed changes (with an MM server URL link to the search UI filtered by the content of that model and ordered by last modified).
- Independently of the above watcher capabilities, other notification emails are also sent to users based on their roles/capabilities, including: See help.
- Workflow transitions on objects where the user has a workflow role, as often as defined by the server or a user.
- Configuration changes (e.g. add/remove model, edit connections) as often as defined by the server or a user, or build errors (in real time) on configurations where the user is a repository manager of that configuration.
- Server errors (e.g. server down) in real time to users with the Application Administration capability.
- NEW OBJECT ROLES & GLOBAL ROLES
- User roles are no longer pre-defined or hard coded, but instead custom built upon an extensive set of elementary capabilities, such as metadata viewing, data administration, workflow editing, etc. User roles are either object roles (assigned on specific repository content) or global roles (applying to the whole environment).
- Administrators can associate users or groups of users with object or global roles; this association is referred to as a responsibility. This way, one may quickly assign roles and their associated responsibilities to individual users or entire groups of users, as needed. See help.
- Data governance, data cataloging, data administration, metadata management, data quality, etc., have a variety of "flavors", best practices, roles and responsibilities. Nevertheless, MM is pre-populated with a set of essential roles (such as Content Custodian, Data Owner, etc.) that can then be customized. Because MM is so flexible in implementation and nearly infinitely customizable, one may tweak or even re-engineer the groups and roles to fit the very specific needs of a given organization. These include:
- A set of global roles to provide high-level administration of the metadata management environment. See help.
- A set of commonly used object roles to allow assignment of all of the capabilities available in the product. See help.
- More specific global and object roles tailored to specific metadata management activities, scenarios and use cases as identified in the user guide.
- Specific global and object roles tailored to specific modeled business processes.
- A RACI (Responsible, Accountable, Consulted, Informed) based example of global and object roles and their assignment.
- NEW WORKSHEET ATTRIBUTES
- The "Stewards" attribute has been moved to the new concept of roles. Therefore, Stewards has been moved from the attribute sheet widget of the Overview tab to a new dedicated Responsibilities tab. Note that a new widget for Responsibilities is also available for anyone to add to the Overview tab if desired.
- The "Used" attribute has been renamed "Has Semantic Usage". More new lineage attributes are also available: "Has Semantic Definition", "Has Data Lineage", and "Has Data Impact". This allows the detection of unused objects. All these attributes can also be used as filters.
- The "Semantic Types" attribute has been renamed "Data Classifications".
- The "Inferred Semantic Types" widget of the Overview tab has been moved to a new attribute "Data Classifications Matched". More new data classification attributes are also available: "Data Classification Rejected", "Data Classification Approved".
- The "Term" attribute has been renamed "Is Defined By" and is now a list of Terms instead of one.
- A new "Term Documentation" attribute shows the list of terms (name and decription) documenting the object (it can also be used as a filter).
- A new "Mapped Documentation" attribute shows the list of semantically mapped objects (name and decription) documenting the object (it can also be used as a filter).
- A new "Inferred Documentation" attribute shows the list of terms (name and decription) indirectly documenting the objects through its pass-through data lineage / impact (it can also be used as a filter).
- A new "Documentation" attribute shows the summarized documentation of the object. The summarized documentation returns the first documentation found on the object following the following priority: Business Documentation > Term Documentation > Mapped Documentation > Inferred Documentation > Imported (Documentation) > Searched (Documentation). This attribute can also be used as a filter.
- "Business Name Inferred", "Business Name Inferred Origin", "Business Description Inferred", "Business Description Inferred Origin" attributes have been deprecated (but still available in this release) as they have been replaced by the new "Documentation" attributes.
- The "Documentation" attribute of glossary terms has been renamed to "Long Description" to not conflict with the new "Documentation" attribute described above.
- New "Data Profiling" attributes have been added "Data Profiling"."Distinct", ."Duplicate", ."Empty", ."Valid", ."Invalid", ."Min", ."Max", ."Mean", ."Variance", ."Median", ."Lower Quantile", ."Upper Quantile", ."Avg Length", ."Min Length", ."Data Profiling", ."Max Length", ."Inferred Data Types".
- The "Certifications", "Endorsements", "Comments", "Warnings" attributes have been renamed to "Certified By", "Endorsed By", "Commented By", "Warned By". In addition to previously supporting filtering, they can now be used as columns showing the list of users that "Certified", "Endorsed", "Commented" or "Warned" the object.
- The "Endorsement Count", "Comment Count", "Warning Count" attributes have been added to the list of possible filters, allowing to produce worksheets/dashboards with popular objects and more.
- The "Certified" attribute was added to the list of filters, again for data governance worksheets/dashboards.
- The "Parent Object Name" and "Parent Object Type" attributes have been added.
- Object roles can be used as columns or filters.
- filter example: expandedMembersOfRole('Steward') = ANY('Business Users')
- select example: membersOfRole('Steward')
- Object relationships/children can be used as columns.
- Term's workflow "Status" and "State" attributes have been changed into the more generic attributes "Workflow State", "Workflow Published", and "Workflow Deprecation Requested", which now apply to any object of a user model under workflow.
- The "Last Modified Date" and "Last Viewed Date" attributes have been renamed to "Updated Date" and "Viewed Date"
- The "Created Date", "Created By", "Updated By" attributes have been added (also available as filters). "Created Date" and "Created By" attributes only apply to non-imported objects
- NEW WORKSHEET FEATURES
- In addition to sorting by Name and Relevance, there is a new ability to sort by "Updated Date" in search, worksheets and the object explorer (ORDER BY "Updated Date" in MQL).
- NEW CLOUD IDENTITIES MANAGEMENT FOR METADATA HARVESTING
New MIMB infrastructure allows password parameters to be based on an external (MM managed) cloud identity (on Amazon Web Services, Google Cloud, or Microsoft Azure), where the Secret / Password parameter can be:
- A secret identifier, which is a URL to the actual secret in a cloud identity secret vault (allowing for external storage of such a secret / password in a cloud secret vault).
- Empty (no longer mandatory), where the authentication is based on the cloud identity on select bridges (such as Microsoft Azure Data Lake Storage, Microsoft Azure Blob Storage, and more to come).
- IMPROVED METADATA REPORTING AND PRESENTATION
- with new graphical widgets (e.g. Responsibilities, Relationships) on object page presentations (e.g. Overview). See help.
- Manage Default Presentation now supports import/export between servers.
- IMPROVED METADATA QUERY LANGUAGE (MQL)
MQL no longer needs the special character syntax on attributes, and adds support for many more "system" objects related to data sampling, data profiling, data classification, user roles, and workflow actions, dramatically extending the power of metadata reporting (worksheets and dashboards), and even superseding the hard coded implementation of menus like My Workflow Actions, or My Term Changes (now renamed My Changed Objects), which is fully customizable with MQL.
For more details on the MQL changes, see the MQL new or improved features, deprecated features, and removed features.
- IMPROVED REST API
especially in support of the new features of this version such as classification, and object and global roles.
For more details on the API changes, see the new or improved API methods, deprecated API methods, and removed API methods.
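As a rough illustration of driving the repository over REST with an MQL filter, the Python sketch below uses the requests library. The server URL, endpoint path, parameter names, and response shape are placeholders only; consult the bundled REST API documentation (MMDoc) for the actual methods of your MM version.

import requests

MM_URL = "https://mm.example.com/MM"          # placeholder server URL
session = requests.Session()
session.auth = ("api_user", "api_password")   # placeholder credentials

response = session.get(
    MM_URL + "/api/v1/search",                # placeholder endpoint
    params={"mql": '"Endorsement Count" > 10 ORDER BY "Updated Date"'},
    timeout=30,
)
response.raise_for_status()
for obj in response.json().get("objects", []):  # placeholder response shape
    print(obj.get("name"), obj.get("type"))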
- IMPROVED ADMINISTRATION UI
such as Manage Users, Groups, Roles, and Classification, for a more harmonized and intuitive look & feel and more improved editing capabilities.
- CHANGES FROM PREVIOUS VERSIONS
- Custom Attributes are now defined within Manage Metamodel.
- All licenses allow editing custom attributes within Manage Metamodel; however, defining new custom objects, relationships and models requires a specific license.
- The repository database upgrade includes a "Migrate Custom Attributes to Metamodel" operation (see Manage Operations for the log).
- Glossaries are now defined within Manage Metamodel.
- The "Standard" package includes a new Glossary metamodel with new KPI and Acronym objects, but removes the Category object which was the only way (in the previous version) to create a Term hierarchy. Terms can now contain Terms, and no longer require the creation of a Category.
- The repository database upgrade includes a "Migrate Glossaries to Metamodel" operation (see Manage Operations for the log).
- The (previous version) Glossary Categories are now migrated as Terms without loss of the hierarchy at the instance level (i.e. the glossary term path). Any MQL use of Categories has been updated to Terms as part of the migration. Note that the count of Terms might be higher, as Categories are now Terms.
- The (previous version) Glossary Term predefined (hard coded) relationships (e.g. More General/More Specific, Contains/Contained By, Represents/Represented By, References/Referenced By, See Also) are migrated as part of an optional "Glossary Extension (MM)" package without loss. However, these relationships are not a mandatory part of the "Standard" package, as users can now define better custom relationships with appropriate names.
- The (previous version) Glossary Term had 2 workflow related properties:
- Status has been renamed "Workflow State" with the same values (Candidate, Draft, Under Review, Pending Approval, Approved, Published, and Deprecated).
- State (Deprecated, New, and Published) is replaced by "Workflow Published" (True/False) and "Workflow Deprecation Requested" (True/False).
- The (previous version) Glossary Term Abbreviation attribute is migrated to a new "Has Acronym" relationship to a new "Acronym" object. Note that Alternative Abbreviation is not migrated.
- The (previous version) Glossary Categories and Terms were used to implement naming standards, which are now implemented by a "Naming Standards" model which contains "Naming Standard" objects containing "Naming" objects. When enabling naming standards, users now have to select which "Naming Standard" object they want to use (instead of a glossary category). The repository database upgrade includes a "Migrate Glossaries to Metamodel" operation which also migrates any Categories and Terms used for naming standards into new objects of the "Naming Standards" model.
- The metadata harvesting browse path (in import/export bridge parameters) is no longer defined as * by default (which allowed browsing any drives, directories and files) for obvious security vulnerability reasons. Administrators must use the Setup UI or command line to define the scope of file browsing.
- The REST API help (MMDoc.war) and any other Tomcat web apps are no longer enabled by default, for security vulnerability reasons (Swagger unauthenticated sensitive endpoints). They have been moved from $MM_HOME/tomcat/webapps to $MM_HOME/tomcat/dev. If desired, these webapps can be enabled with the Setup UI or command line as follows: $MM_HOME/Setup.sh -we mmdoc. This will create the context MMDoc.xml in $MM_HOME/tomcat/MetaIntegration/localhost to make the webapp available and start it.
- The dashboards can no longer store and execute user defined JavaScript, for security vulnerability reasons (to mitigate XSS vulnerabilities). Consequently, a new third party library (DOMPurify) strips all XSS properties when rendering the custom HTML of the HTML widget. Therefore, if the HTML contains any tags that are not purely formatting tags (font color, size, images, etc.), they will be removed automatically before displaying the HTML. In addition, all HTML attributes allowing JavaScript entry (onclick, onload, etc.) are also removed.
- THIRD PARTY SOFTWARE UPDATES
All third party & open source software has been upgraded to the latest versions for bug fixes, improvements, and better security vulnerability protection. For more details, see the published MIMM Third Party & Open Source Software Usage and LICENSES.
- SECURITY VULNERABILITY UPDATES
Numerous major improvements to resolve any new security vulnerabilities.
- PRE UPGRADE REQUIREMENTS
- Successful cleanup of the repository. See help.
- Successful upgrade to the previous major release, including the post upgrade manual migration of any single-model database imports (deprecated in the previous release, now unsupported but still working) into multi-model database imports. For more details, see the POST UPGRADE section of the previous version's release notes.
- Successful update to the latest MIMM and MIMB cumulative patches for the previous major release, for the main MIMM server as well as all metadata harvesting servers. Post patching best practice assumes:
- successfully re-harvesting (importing) all models (in order not to blame the new major release later),
- rebuilding all configurations,
- deleting any unused versions of models and configurations (it may take 3 days for the database to purge deleted models),
- and making sure that the database maintenance and search index are up to date.
- Successful database backup and restore of the MIMM repository using the actual database restore/backup technology (do not use the MIMM application backup).
- Successful backup of the install data directory.
- Clean install of the latest build of the full MIMM in a new directory (do not reuse/overwrite the previous version's install directory).
- Repository on PostgreSQL Database:
If you were using the bundled PostgreSQL database server, which is only available in the Windows version of MIMM, then this database first needs to be upgraded to the new PostgreSQL 13.2 bundled with this new version of MIMM (see Application Server Upgrade > Reconfigure your MIMM Database Server (ONLY if you are using the bundled PostgreSQL database on Windows)).
- WARNING: Finally, it is highly recommended to first test in a completely separate QA environment configured as follows:
- Assuming all of the above steps have been performed in the production environment, make a full (dump) copy of the MIMM repository database instance into a new one (it is recommended to temporarily stop the MIMM application server and database server for that).
- Perform a clean install of the full MIMM software in an empty directory on the new QA machine.
- Copy the data directory from the MIMM production installation directory to the QA installation directory (this will avoid a full Lucene re-indexing).
- Use the setup utility to point to the new QA repository database instance, and start the MIMM application server.
- POST UPGRADE ACTIONS
- Level 1 - Review the logs for any automatic upgrade migration errors
- MANAGE > System Log should look as follows in a typical successful upgrade migration:
MIRWEB_I0044 Starting database upgrade. Product version is 31.31.2 whereas database version is 30.22.2.
DBUTL_I0031 Updating database from version 30.22.2
...
MIRWEB_S0005 Running operation: Upgrade data mappings to create classifier links
MIRWEB_S0005 Running operation: Migrate Custom Attributes to Metamodel
MIRWEB_S0005 Running operation: Migrate Business Glossaries to Metamodel
MIRWEB_S0005 Running operation: Migrate Term Classification links to Metamodel
MIRWEB_S0005 Running operation: Migrate Hide Data property to Sensitivity Label
MIRWEB_S0005 Running operation: Migrate steward of content with the 'Send email notification when an import' option to watcher
...
SEARCH_I0005 Indexing model [463,1].
...
MIRWEB_S0101 Server is initialized.
For automated testing purposes, the following messages can be searched for in the log (see the sketch after this list):
-
On a successful start/upgrade of the server:
MIRWEB_S0101 Server is initialized. -
On a failure to upgrade the server:
MIRWEB_F0003 Service initialization error: -
On a failure to start the server:
MIRWEB_F0004 General error during service initialization:
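For example, an automated check might scan the system log for these codes, as in this minimal sketch (the log file path is an assumption; point it at your actual system log):

# Sketch: scan an MIMM log file for upgrade/startup outcome codes.
# The log path below is a placeholder, not the documented location.
from pathlib import Path

SUCCESS = "MIRWEB_S0101"          # Server is initialized
UPGRADE_FAILURE = "MIRWEB_F0003"  # Service initialization error
START_FAILURE = "MIRWEB_F0004"    # General error during service initialization

log_text = Path("data/logs/mm.log").read_text(errors="replace")

if SUCCESS in log_text:
    print("Server started/upgraded successfully.")
elif UPGRADE_FAILURE in log_text or START_FAILURE in log_text:
    print("Upgrade or startup failed; review the system log.")
else:
    print("No definitive status message found yet.")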
-
MANAGE > Operations
provides the details for each migration operation (in case of errors in the above system log):
- Upgrade data mappings to create classifier links
- Migrate Custom Attributes to Metamodel
- Migrate Business Glossaries to Metamodel
- Migrate Term Classification links to Metamodel
- Migrate Hide Data property to Sensitivity Label
- Migrate steward of content with the 'Send email notification when an import' option to watcher
-
Level 2 - Test Previous Release Basic Features (Object Search, Explore, Edit, Trace Lineage, etc.)
- Do not yet start any metadata harvesting (model import) or configuration build until you reach Level 4 below. In fact, the upgrade process preemptively disables any automatic harvesting post upgrade (see MANAGE > Schedules).
- Make sure you read the above release notes on: NEW WORKSHEET ATTRIBUTES, and CHANGES FROM PREVIOUS VERSIONS.
-
Level 3 - Test Previous Release Customization Extensions (MQL, Worksheets, Dashboard, Presentations, REST API)
- Once again, do not yet start any metadata harvesting (model import) or configuration build until you reach Level 4 below.
- Make sure you read the above release notes on: NEW WORKSHEET ATTRIBUTES, IMPROVED METADATA QUERY LANGUAGE (MQL), IMPROVED REST API, and CHANGES FROM PREVIOUS VERSIONS.
-
Level 4 - Test Metadata Harvesting (Model Import and Configuration Build)
-
If you used remote harvesting agents (servers),
you must install new ones (based on this version) in a new directory.
You may also want to copy the data directory from the old install in order to reuse the metadata cache for incremental harvesting.
If you need to have both the previous and current versions of the software temporarily coexist on the same machine for testing (until moving to production), then you must configure the new agents (servers) to run on separate ports (using the setup utility), and update them accordingly on the main server (using MANAGE > Servers). -
Do not yet create any new models to import with new parameters; instead:
- It is recommended (but not mandatory) to perform a manual full import (no incremental harvesting), as this new version of the software may bring more detailed metadata. In any case, only import a few selected models to start with (and one at a time). We assume here (per the above PRE UPGRADE REQUIREMENTS) that such models imported just fine (with the exact same parameters) in the previous version of the software.
- A day later, you can perform more manual full imports (one at a time), this time verifying that incremental harvesting works.
- Another day later, you can finally re-enable the scheduled automatic import (metadata harvesting) of that model. Then repeat the above steps again for other models.
- Level 5 - Test New Release Features from above release notes
-
Level 6 - Patch upgrades from earlier builds of this version
-
"MANAGE > Secret Vaults" capability has been enhanced and replaced by "MANAGE > Cloud Identity".
The migration should be seamless as the upgrade patch automatically migrates any existing configuration settings for Amazon AWS, Google Cloud, or Microsoft Azure, migrating it all from entries in MANAGE > Secret Vaults to cloud identities in MANAGE > Cloud Identity. The ability to use a Cloud Secret Vault to externally store the bridge password parameter is preserved through the migration. Any use of the above secret vaults as a URL based password parameter of a model import is automatically detected upon the first (manual or scheduled) import to automatically populate the cloud identity of that model import.
With this improvement, there is now support for more than one cloud identity per cloud technology. In addition, to support the more robust cloud identity features, select import bridges will support more automatic cloud identity-based authentication. In this way, the password parameter is no longer mandatory for those bridges and the authentication may be based on the cloud identity.
v10.1.0 (GA 01/31/2020 - Deprecated 12/31/2021 - EOL 12/31/2022)
-
NEW METADATA REPORTING DASHBOARDS
with full BI reporting dashboard capabilities (tile layout, widgets for containers, numbers, statistics such as grids, bar or pie charts, etc.) based upon the new Metadata Query Language (MQL) and Worksheets technologies released in the previous version.
MQL based Metadata Worksheets combined with other dimensions (e.g. history/time, users/groups, configurations, etc.) enable users to produce powerful dashboards (e.g. recently approved terms, what's new, workflow to do list, etc.).
As with Collections and Worksheets, users have the ability to save/manage/share Dashboards. -
NEW UI CUSTOMIZATION
based on the above new dashboard technology, allowing users to customize the UI:
-
for each Repository Object (e.g for Terms, Tables, Columns, etc.)
where the Overview Tab is now fully editable as a dashboard,
and all other Tabs can be individually hidden to simplify the UI for business users.
Administrators can manage the Default Overview to present metadata differently to different user groups. -
for each Configuration Home Page (now available by clicking the banner logo)
which is now based on the full power of new dashboards.
Administrators can manage the Default Dashboard for the home page of different user groups.
-
NEW UI LOOK & FEEL
dramatically improving the user experience:
- New Header Banner:
- Simpler customization harmonized between Metadata Explorer and Manager with $MM_HOME/conf/resources/MM.properties.
- Simpler modern top right menus:
- Tool submenu with access to other tools (e.g. Metadata Manager) and Help.
- User submenu with direct access to edit the user account profile (e.g. define full name, email, and now with photo avatar), preferences, and log out.
- Configuration submenu with direct access to the default, recent and other configurations.
- New Header Menus with Objects, Collections, WorkSheets, Dashboards, and Manage with harmonized SubMenus to Explore, Manage, and direct access to Favorites and Recent.
- New Explorer Panel (on the left for navigation) which is now available (harmonized) for Objects, Collections, WorkSheets, and Dashboards.
- New Object Level Attachments such as associating documents to a Term, Table or even Column.
- New Object Preview Thumbnail Image manually uploaded by users, or automatically imported by some bridges (such as a BI Report Preview Thumbnail Image when supported by BI import bridges like Tableau and Spotfire).
- New User Avatar (photo).
- New User Authentication Management UI harmonized between Native, LDAP, OAuth, and SAML.
- New End User's Active Operation Monitoring (see below for details)
- New Administrator's All User Active Operation Monitoring (see below for details)
- Improved Configuration Management (Build) UI (see below for details)
- Improved Column/Field Order to physical order by default (instead of alphabetical order) for table columns and file fields (Note that the model must be fully re-harvested to get the new physical order).
- Improved Data Flow Lineage filtering performance now performed locally on the web client, also reducing the server load.
- Improved Management UI for Users, Groups, etc. with a harmonized simpler UI.
- Improved overall layout, graphics (icons) and presentation.
-
NEW UI INTERNATIONALIZATION
with the ability to translate all the UI menus of both the Metadata Explorer and Metadata Manager, including:
- The MIMB Metadata Profile based UI (metamodel vocabulary: e.g. schema, table, column, field, dimension, job, etc.)
- The MIMB Metadata import/export bridge UI (bridge parameter names, enumeration values, and descriptions/tooltip)
-
NEW ACTIVITY MONITORING / OPERATION MANAGEMENT
with real time monitoring of any activity such as concurrent model imports (metadata harvesting), a configuration build, and other operations such as a repository backup/restore.
From the UI perspective, a new activity processing icon (spinning gear) appears on the top right of the banner with a counter of concurrent operations. When all operations are completed (no longer active), the gear icon stops spinning. If there was any error in any of the operations, the operation counter is replaced by a red error icon.
At any time, a user can click on the activity monitoring icon to list the operations, and jump to the desired log, before closing/discarding the activity icon upon completion. Note that only the activities/operations running on behalf of the current user are displayed. In addition, Administrators also have a new Manage/Operations panel to list any active operations running on behalf of any user. -
NEW REPOSITORY MANAGER
(MANAGE > Repository) has been fully redesigned to replace the legacy Metadata Manager UI.
Over the past few versions, any new features were developed (and therefore only available) in the modern Metadata Explorer UI. At the same time, all editing capabilities of the legacy Metadata Manager were progressively migrated (while redesigned) into the modern Metadata Explorer UI. The last editing feature was the Semantic Mapper, now available in the Metadata Explorer of this new version. All remaining other features (the model version management itself) of the legacy Metadata Manager UI have now been migrated, redesigned and improved as MANAGE > Repository. Links from the Metadata Explorer to Show in Metadata Manager have been replaced by Show in Repository Manager. Therefore:
- The legacy Metadata Manager UI is no longer linked anywhere from the Metadata Explorer UI. New MIMM users should never be exposed to the legacy Metadata Manager UI. The legacy Metadata Manager UI remains bundled in this version as MM/Manager in the URL. However, it is deprecated and will be fully removed in the next version.
- The so called Metadata Explorer UI is now the sole UI as just MM in the URL (instead of MM/Explorer)
-
IMPROVED VERSION AND CONFIGURATION MANAGEMENT
with incremental stitching dramatically accelerating the re-building of configurations based upon the incremental harvesting of multi-models from DI/ETL servers, BI servers (like Tableau), Data Lakes (HDFS, Amazon S3, etc.), and now large Data Warehouses (Hive, Teradata, Oracle, etc.):
- New Configuration Build UI preventing concurrent configuration builds, and providing better configuration status updates.
- New Configuration Model Connection Stitching UI to support the new multi-model databases, including:
- The 2 levels (Database Model, and Schema) of traditional database servers like Oracle and Hive.
- The 3 levels (Database Model, Database Catalog, and Schema) of Microsoft SQL Servers.
- Improved Configuration Connection UI harmonization between the Metadata Explorer and the Metadata Manager.
- Improved Change Management Detection by no longer relying on the physical native id of the object (e.g. table), relying instead on the namespace (e.g. schema/table/column), therefore preventing invalid change detection when the database was re-created (backup/restore), or when pointing from a development to a production server.
-
IMPROVED DATA SAMPLING AND DATA PROFILING
with incremental updates on harvesting. -
ARCHITECTURE, DEPLOYMENT & INTEGRATION
- Improved User Authentication Management (in particular OAuth and SAML) to support the latest versions of these standards, and their implementations in popular servers like Azure ADFS.
- Improved Repository Database Space usage with incremental harvesting of the database servers that are now supported as multi-models at the schema level, as every newly harvested version will reuse the stable schema sub-models.
- All Third Party & Open Source Software has been upgraded to the latest versions for better security vulnerability protection.
-
UPGRADE REQUIREMENTS:
-
No specific migration preparation steps are needed as this version upgrade is compatible with the previous one; however:
- As with previous upgrades, make sure you follow proper upgrade steps, including applying the current version's latest cumulative patch, making sure the database maintenance is up to date, and performing a full database backup.
- You must perform a clean install of the new software (i.e. this is not a cumulative patch on top of the previous version).
-
If you were using the bundled PostgreSQL database server, which is only available with the Windows version of MIMM,
then this database first needs to be upgraded to the new PostgreSQL 12.1 bundled with this new version of MIMM.
(See Application Server Upgrade > Reconfigure your MIMM Database Server (ONLY if you are using the bundled PostgreSQL database on Windows).)
-
POST UPGRADE:
-
UI Customization:
Any existing customization of the UI Banner in $MM_HOME/conf/resources/MM.properties
or of the Metadata Explorer in $MM_HOME/conf/resources/MetadataExplorer.xml
needs to be manually copied to the newer simpler harmonized version of that file in $MM_HOME/conf/Template/resources. -
Database Models:
Large database servers containing multiple schemas (e.g. Oracle, Teradata, etc.) or multiple database instances (e.g. Microsoft SQL Server) are now imported as multiple models (e.g. one model per schema or database instance) with support for incremental harvesting (e.g. only updated or new schemas are imported while unchanged schemas are reused from the harvesting cache). After the upgrade, any newly created database model will be harvested as a multi-model by default. Single large model database import is officially deprecated with this version but still supported; it will be removed at the next major release. Therefore, existing database models will continue to be harvested as single (large) models for this release only.
In order to benefit from the new faster and more space efficient multi-model databases, the existing database models can be upgraded using a dedicated operation (Migrate to multi-model databases) which is available at the database model or directory level (in the Repository Manager). This conversion operation converts the single database model into a multi-model database (one model per schema) while taking care of the migration of the database documentation (business names, descriptions, etc.), connection stitching, and any involvement in data flow or semantic mappings. Note that this conversion operation has the following known limitations:
- When a single-model database content has multiple versions, the migration process migrates the latest version only. You can find the other versions in the original single-model content moved under the migration folder. These versions will retain relationships to the mappings and configurations that use them.
- Web browser bookmarks of objects in the migrated single-model contents are obsolete, as they reference these objects using obsolete identifiers.
- Migrated contents retain the names (paths) of the original contents but not their identifiers. For example, this can invalidate references to single-model objects in Worksheet filters.
- Only Diagrams with objects in the same schema are migrated.
- The layout of migrated objects in Configuration Architecture diagrams may be lost.
- Migrated Data Mappings will not include target expressions that were previously broken.
v10.0.1 (GA 04/24/2019 - Deprecated 12/31/2020 - EOL 12/31/2021)
-
NEW METADATA QUERY LANGUAGE (MQL) & METADATA REPORTING WORKSHEETS:
-
New METADATA QUERY LANGUAGE (MQL)
allows users to define powerful and complex metadata queries with the familiar SQL syntax. MQL is available through the REST API and constitutes the foundation of a brand new Metadata Reporting UI (see the illustrative sketch at the end of this section). -
New METADATA WORKSHEETS
based on the new Metadata Query Language (MQL) and a brand new UI. Users start from simple search and powerful filters (automatically building MQL queries) to produce a set of metadata on which the user can select the desired columns (metadata properties) to create/save a Worksheet.
- Worksheets can be shared with other users (of selected groups), and easily managed with quick access to Favorites and search for worksheets shared by others.
-
Worksheets implement the Metadata Tabular Reporting (Grid Mode) solution, and can be switched to simple business friendly reports (List Mode).
Worksheets are also the foundation (building block) of the total metadata reporting solution, which will include Dashboards in the next version. -
Worksheets are a lot more powerful than static reports: they allow users to dynamically interact by refining filters, sorting columns, etc.
In fact, Worksheets also implement the powerful Metadata Bulk Editing capabilities, allowing users to search/filter and easily set some common properties (including custom attributes and curation).
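For illustration only, such a query could be submitted through the REST API along these lines; the endpoint path, payload shape, credentials, and the exact MQL text below are assumptions for the sketch, not the documented interface (see the REST API SDK section):

# Sketch: submit a hypothetical MQL query through the MIMM REST API.
# Endpoint, payload shape, and credentials are assumptions for illustration.
import requests

MM_URL = "http://localhost/MM"  # placeholder server URL

mql = "SELECT name, description FROM [Table] WHERE name LIKE '%CUST%'"  # illustrative MQL

response = requests.post(
    f"{MM_URL}/api/mql",                 # hypothetical endpoint
    json={"query": mql},
    auth=("Administrator", "password"),  # placeholder credentials
    timeout=30,
)
response.raise_for_status()
for row in response.json().get("rows", []):  # hypothetical response shape
    print(row)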
-
IMPROVED USER INTERFACE EXPERIENCE:
-
New CUSTOMIZABLE HOME PAGE
based on an open source sample of a tile based home page with customizable JSON files, with shortcut icons pointing to internal or external URLs, MQL based reports, etc.
Warning: this custom code home page approach is only a short term solution, as it will be replaced by the new official full Dashboard capabilities of the next version. -
New METADATA EXPLORER MENUS
have been redesigned to replace the BROWSE menu based on the new metadata reporting capabilities (now with worksheets and soon dashboards) where all existing capabilities (and new ones) have been organized as follows:
-
HOME
has been removed from the menu bar (to gain space), and replaced by clicking on the top left home logo. -
A new EXPLORER PANEL
is available on the left side as a pull down arrow menu (the explorer panel can be pinned) which emphasizes navigation using various organizational structures presented as a hierarchy (tree or drill down). The Explorer Panel offers quick access to metadata through:
- The Browse tab offers tree navigation from the root of the repository deep into one or multiple models on a single simple tree. It also supports tree browsing from the technical perspective organization of the entire repository, or from the business driven organization of a given configuration. The tree navigation can be set to drill navigation when you need to work with a long list of siblings (e.g., 10K tables in a database schema).
- The Search tab offers quick access to metadata query/filters and even save them as reporting worksheets (see above).
- The Collections tab offers quick access to predefined metadata Collections (see below).
-
OBJECTS
allows users to search for metadata objects, manage (view, build, favorites, share with me) metadata object collections, look at recently accessed objects (history), browse by object categories (e.g. databases, files). -
WORKSHEETS
allows users to manage (view, design, favorites, share with me) metadata reporting worksheets, look at recently accessed worksheets (history), get quick access to favorites, and dynamically create worksheets by categories and sub categories (e.g. databases / tables / columns). -
DASHBOARDS
will be available as part of the next version. -
MANAGE
remains unchanged (for administrators only).
-
New METADATA COLLECTIONS
re-branding the previous concept of Lists and harmonizing it with the new Worksheets in terms of UI usage and management. -
Improved GRAPHICAL LINEAGE DIAGRAMS
with better layout for improved object readability and data flow navigation.
-
ARCHITECTURE, DEPLOYMENT & INTEGRATION
-
New SSO Authentication with SAML 2.0
has been added (in addition to OAuth 2.0 added in the previous version). SAML (Security Assertion Markup Language) is an umbrella standard that covers federation, identity management and single sign-on (SSO). In contrast, OAuth (Open Authorization) is a standard for authorization of resources; unlike SAML, it does not deal with authentication. - Oracle JDK has been replaced by OpenJDK 11.
- All Third Party & Open Source Software has been upgraded to the latest versions for better security vulnerability protection.
-
UPGRADE:
- There are no upgrade specific steps as this is a minor version upgrade fully compatible with the previous repository. However, you must perform a clean install of the new software (i.e. this is not a cumulative patch on top of the previous version).
- Some Java bridges depend on the tool's SDK, which is not necessarily compatible with OpenJDK 11, now the default JRE for the bridges (typically these tools' older SDKs can only work with older versions such as Java 8). In such cases, the bridge tries to automatically run with the JRE bundled with the tool software / SDK instead. When this is not possible, the bridge offers the ability to manually point to a compatible JRE in the bridge's Miscellaneous parameter. Bridges for tools having compatibility issues with OpenJDK 11 include at least Oracle Data Integration and SAP BusinessObjects.
v10.0.0 (GA 09/21/2018 - Deprecated 09/21/2020 - EOL 12/31/2020)
-
BRAND NEW USER INTERFACE EXPERIENCE:
-
METADATA MANAGER VS METADATA EXPLORER UI:
- In previous MIMM versions, the MIMM Web Application Server has offered two different User Interfaces (UI) targeting different user communities. The original Metadata Manager UI was designed for the advanced technical users with a traditional development tool layout including multiple panels: tree structure on the left, multi-tab windows in the middle, attributes on the right, and log activities at the bottom. The Metadata Manager UI also presents the highest level of details and complexity of all harvested metadata. The Metadata Explorer UI was initially introduced as a read only UI with simpler metadata for business users offering an easy to use layout for multiple devices, including tablets. The Metadata Explorer became the new UI platform for all new editing capabilities such as the business glossary or data modeling.
- With MIMM v10.0, all other editing capabilities are now available in the Metadata Explorer UI, including data mapping, enterprise data architectures (Configuration editor and model stitching), and even Administration features like Custom Attributes, which are now available in the Metadata Explorer UI > MANAGE > Custom Attributes. Consequently, the Metadata Manager UI is now only necessary (and therefore available) in the MIMM Advanced Editions for repository management (with multi version and configuration management). The MIMM Standard Edition v10.0 is now fully implemented in the Metadata Explorer UI where MANAGE > Repository allows users to directly create models in the default single configuration, import metadata, stitch models (connections), and trace lineage right away.
-
METADATA HOME PAGES:
New metadata home pages with multiple top tabs offer quick access to all key information:
- The first tab is always the Overview tab which provides a dashboard to all critical information and properties.
-
The next set of tabs are specific (metamodel / profile driven) to the type of object, for example:
- Database Table objects have tabs for Columns and Constraints.
- BI Reports (like Tableau Workbook) objects have tabs for Dashboards, Worksheets, and Data sources.
-
The next set of tabs are for the common critical metadata analysis:
- DATA FLOW for data lineage and impact analysis.
- RELATIONSHIPS for detection, management, and curation of relationships (see new features below)
- SEMANTIC FLOW for definition and usage perspectives (see new features below)
- The last set of tabs are for common documentation and administration like: Comments, Attachments and Audit Log.
-
METADATA QUICK ACCESS:
Much improved ways to quickly access the right metadata:
- SEARCH has been massively improved in both real time performance (now based on Lucene) and in functionality as a metadata driven search with natural language semantic search (see new features below)
- BROWSE has also been massively improved in both performance (now also Lucene based) and in functionality as a metadata asset type driven browser with support for hierarchical display at all levels of any data sources including database, DI, BI, Data Lakes, and even NoSQL (like JSON hierarchical structures)
- Enterprise ARCHITECTURE driven graphical navigation allows users to drill down from a top down big picture of the enterprise architecture.
-
METADATA REPORTING:
Brand new powerful unified metadata reporting capabilities where both search and browse end up at the same reporting page, which is also directly available at Browse > Report. Starting from search simply predefines the text filtering (e.g. customer), while browsing predefines a category (e.g. database / tables), and direct access to reporting does not predefine anything.
- The reporting capabilities offer the selection of multiple categories (e.g. database / tables + flat files) and subsetting by content (My Data Lake + Sales DW database) before drilling down with the following filters:
- Then filtering is available for Last Modified, Stewards, Labels, Semantic Types, Endorsed By, Certified By, Created By, Warning By, and Commented By.
- Finally, more custom filtering per attribute (including custom attributes) common to the metadata subset (e.g. SecurityLevel = Orange).
- Reports can be reused by saving the URL as favorites (future versions will support full report management within the application).
-
METADATA USER LISTS:
Brand new user list management feature allows users to define and manage lists of metadata objects. Just like labels, lists are available anywhere in the UI for adding/removing objects, bulk editing, and management. Lists can contain any type of metadata, such as my favorite list of terms, tables, or reports. Lists can also contain multiple types of content, such as my to do list with terms, tables, and reports in that list. Lists can be shared with other users when marked as public, such as our quarterly review list. Note that lists are flat, therefore not hierarchical and with no sub-list or include concepts.
-
METADATA TAGGING WITH LABELS:
The metadata tagging with labels has been much improved to be harmonized with the brand-new list management experience in order to facilitate adding/removing objects anywhere, grid editing, and more.
-
METADATA DOCUMENTATION:
Much improved ways to document metadata:
- MULTI-LINE TEXT has been introduced (in addition to the previous single-line Text) for better formatting and layout. In addition, Multi-Line Text has been enhanced with support for URL links and embedded image attachments using a JIRA like syntax. Multi-Line Text is not only the default format for all Descriptions and Comments, but is now also available as a new type of Custom Attribute that can be applied to any metadata for documentation.
- RICH TEXT Documentation with (WYSIWYG) Visual Edition is not only the default medium for Glossary Term documentation, but is now also available as a new type of Custom Attribute that can be applied to any metadata for documentation.
- SQL TEXT of SQL Views, Stored Procedures and more is now better presented with colored syntax and optional reformatting. Note that this is not a new type of custom attribute, but any predefined attribute with SQL is better formatted.
- ATTACHMENTS (such as pictures, documents, etc.) have been enhanced as part of their integration with the new Metadata Explorer, including Management (Drag and Drop), Preview, and Thumbnails that can be embedded in the Text (and Multi-Line Text) descriptions, comments and custom attributes.
-
DATA MODELING AND DOCUMENTING ANY HARVESTED METADATA
- In MIMM v9.1, existing data stores such as RDBMS could be harvested as a Physical Data Model (PDM) instead of a simple Model, in order to offer full documentation including business glossary term reuse based upon automatic semantic links, reverse engineering based upon naming standards, data modeling with diagramming, and of course automatic change management (re-harvest/compare/merge).
- In MIMM v10.0, all the above capabilities are now available on any harvestable model content without having to create a PDM. In other words, any data integration, business intelligence, reports, or data stores (relational, hierarchical, NoSQL, files, etc.) can be documented as needed, including support for relational data models. Consequently, all existing PDM in MIMM v9.1 may be converted to Models in MIMM v10.0 without loss of any existing documentation (including diagrams).
-
The documentation (business names and definitions) process has been improved allowing any object (e.g. table, column, report field) to be quickly and easily:
- "Classified" with a local semantic link to a glossary term, without having to use an intermediate Semantic Mapping content, or associating the Model to a Glossary as with the PDM.
- "Documented" with a local business name and definition overwriting any Semantic link (Classified, Mapped or Inferred)
- When harvesting databases that are already documented in data modeling tools (e.g. Erwin), such data models can be imported as a separate model and automatically stitched directly to the matching harvested database (without using any semantic mapping model). The semantic stitching is automatically maintained as both the database and its associated data model are independently re-imported/refreshed on a regular basis (the stitching will report inconsistencies). From the user perspective, the documentation (business names, descriptions, relationships, diagrams) of any harvested database table / column is automatically inherited from its associated data model.
- Note that MIMM Advanced Edition with Authoring also allows one to use a PDM to create data models from scratch (e.g. design new Hive table requirements) without pointing to a live database, rather than simply documenting existing data stores. The PDM concept is retained for this purpose. Note that a Conceptual/Logical Data Modeling capability may also be added in future versions.
-
DATA CATALOGING
- Brand new Data Cataloging applications well integrated with the existing Data Governance (DG) capabilities, and based upon the solid Metadata Management (MM) foundations with full data lineage and powerful metadata version and configuration management.
- Managing both modern cloud based data lakes and classic Data Warehouse (DW) Enterprise Architectures.
- Harvesting metadata from both modern (XML, JSON, Avro, Parquet, ORC files, Hive tables, and Kafka messages) and classic (relational tables / CSV files) data technologies.
- Advanced data driven metadata discovery through integrated powerful Data Profiling and Reference Data capabilities.
- Presenting a brand-new business driven Data Cataloging User Interface (UI) experience.
- Integrating forward engineering with self-service Data Preparation, Data Quality, Data Integration, and Business Intelligence design tools.
-
DATA SAMPLING, PROFILING, & SECURITY
- New "Data Viewer" security role allows authorized users (or groups) to sample data (read only access).
- New "Data Manager" security role allows authorized users (or groups) to enable full data profiling of selected objects (tables, files, etc) on demand.
- New data security protection allows Data Managers to set a "Hide Data" property at the column level (e.g. SSN column). In addition, data can also be automatically hidden when detected at the Semantic Type level (see new feature below) where a newly harvested set of files in a data lake may contain sensitive data (e.g. of Semantic Type SSN).
- New Sample Data tab on the home page of any data store object (e.g. file, table) allows users to view sample data to better define its metadata.
- New Data Profiling Statistics are displayed (including graphical reports) on the Overview tab (dashboard) in the home page of any data store object (e.g. file, table) as well as in the properties grid right panel and within the grid view.
-
SEMANTIC TYPES Discovery & Management
- Semantic Discovery (semantic types, patterns/lists machine learning)
-
RELATIONSHIPS Discovery & Management
-
Relationship Discovery using the following methods:
-
Automatically "Inferred" based on:
- Metadata Usage Driven: using the surrounding data flow usage such as joins in DI (ETL Tools, SQL Scripts or Data Prep) and BI (traditional or self-service) activities.
-
On Demand "Detected" based on:
- Metadata Name Matching: for example, PurchaseOrder.SKU = Product.SKU or Customer.AccountId = Account.Id (see the toy sketch after this list).
- Semantic Definition Matching: classified by users to the same glossary term.
- Semantic Type Matching: discovered through data profiling (e.g. SSN syntax or VIN number)
- Relationship Management with user defined relationships and social curation (e.g. endorsed or certified joins for active data governance generation into a DI or BI design tools)
- Dynamic Data Model diagram generation from Relationships surrounding any object (e.g. table or file).
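As a toy illustration of the name-matching method above (not the product's actual detection algorithm), candidate joins can be proposed by matching column names across tables:

# Toy sketch of metadata name matching for relationship discovery.
# Illustrative only; real detection logic is far richer (e.g. it would
# filter out generic column names and deduplicate symmetric pairs).
tables = {
    "PurchaseOrder": ["SKU", "OrderDate"],
    "Product": ["SKU", "Label"],
    "Customer": ["AccountId", "FullName"],
    "Account": ["Id", "Owner"],
}

def candidate_joins(tables):
    """Yield candidate joins: exact column-name matches, plus the
    '<Entity>Id = Id' convention (e.g. Customer.AccountId = Account.Id)."""
    items = [(t, c) for t, cols in tables.items() for c in cols]
    for t1, c1 in items:
        for t2, c2 in items:
            if t1 == t2:
                continue
            exact = c1 == c2
            entity_id = c1.lower() == t2.lower() + "id" and c2.lower() == "id"
            if exact or entity_id:
                yield f"{t1}.{c1} = {t2}.{c2}"

for join in candidate_joins(tables):
    print(join)  # e.g. PurchaseOrder.SKU = Product.SKU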
-
SOCIAL CURATION
- Endorsement, warnings, certifications with impact on search ranking.
-
SEMANTIC SEARCH
- Metadata driven search language such as "Customer tables" for any tables with Customer in the name, "tables with SSN" for any table with an SSN column (e.g. for GDPR), or "ROI in Reports" for any reports containing ROI.
-
SEMANTIC MAPPING
-
Major improvements in semantic mapping including in place semantic mapping via two approaches:
- Top-down from Business Glossary Term or Data Model Entity/Attribute,
- Bottom-up from Data Store Tables/Columns or Report Fields.
-
SEMANTIC FLOW
-
Major improvements on the semantic flow analysis now also supporting the documentation process acting as an interactive dashboard for finding definitions that are:
- "Local" (within the Model) that has been either "Imported" (metadata harvesting) or locally "Documented" (edited description overwrite),
- locally "Classified" (within the Model) to an external glossary term,
- directly "Mapped" via a Semantic Mapping Model or direct stitching (e.g. between a database and its data model),
- indirectly "Inferred" through complex data flow pass through and semantic flow (which can be graphically analyzed in the data flow diagram), or
- "Searched" for by name in all glossaries.
-
RELATED REPORTS
- Related Reports are now available on any metadata objects such as files, tables, columns, etc. (instead of just business terms). This allows business users looking at the result of a search to have direct access to a simple list of any related reports in any BI tools (crossing all semantic and data flows without exposing any of the complexity to business users). Such reports can then automatically be opened in their respective BI tool technologies, therefore acting as a multi-vendor BI tool web portal for business users.
-
DATA CONNECTIONS / METADATA STITCHING
- Complete support for file format harvesting and stitching.
- Connection pool factorization (e.g. from DI and BI servers) to minimize the number and complexity of stitching connections.
-
DATA MAPPING Specifications & Design
- The Data Mapping Specifications and the Data Mapping Designs have been fully redesigned and merged into brand new Data Mappings that can be used for multiple purposes, from capturing data flow mapping requirements all the way to developing a full data mapping design that may be forward engineered (see Active Data Governance) into SQL Scripts or DI/ETL tool jobs (e.g. Informatica PowerCenter, Talend DI).
- The Data Mapping tool allows for the mapping of multiple source data stores into a target data store in multiple steps with (schema or table level) Bulk Mappings and (column/field level) Query Mappings. The data mapping tool offers new graphical mapping visualization as you map, and new expression syntactical editors when designing joins, lookups, filters, etc.
-
ACTIVE DATA GOVERNANCE
-
From Physical Data Models (PDM) or Models harvested from Data Modeling tools or Relational Databases, with forward engineering:
- to any supported Data Modeling tools (e.g. Erwin),
- to any supported Data Integration (DI/ETL) tools (e.g. Talend DI) as source/target models,
- to any supported BI Design Tools (e.g. Tableau, SAP BusinessObjects, IBM Cognos).
-
From Data Mapping specifications and designs, with forward engineering:
- to any supported DI/ETL tools (e.g. Talend DI).
-
ARCHITECTURE, DEPLOYMENT & INTEGRATION
- Search engine re-designed and optimized with Apache Lucene offering near real time metadata search and navigation (and removing any dependencies on underlying database text search capabilities)
- Third party software upgraded to the latest of Java 8, Apache Tomcat 9, PostgreSQL 10 and more for security and performance improvements.
- Single Sign On (SSO) integration architecture has been redesigned for easy external authentication with redirect using custom scripts in any language such as Python (note that MIMM v9.1 external authentication required complex custom integrated Java scripts). This includes new support for SSO Authentication with OAuth 2.0. Post GA cumulative patches will include support for the Security Assertion Markup Language (SAML) standard, and native cloud authentications such as Amazon AWS and Microsoft Azure. A minimal illustrative sketch follows this list.
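For illustration only, such an external authentication script could be a small Python HTTP handler that validates the corporate SSO context and redirects into MIMM. The header name and redirect contract below are assumptions, not the documented MIMM integration; see the Custom integration with Authentication Environments section for the actual contract:

# Sketch: external authentication hand-off for MIMM SSO (illustration only).
# The request/response contract (header name, redirect URL) is assumed here.
from http.server import BaseHTTPRequestHandler, HTTPServer

MM_URL = "http://localhost/MM"  # placeholder MIMM server URL

class SsoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # In a real deployment, validate the corporate SSO token here.
        user = self.headers.get("X-Remote-User")  # hypothetical header set by an SSO proxy
        if user:
            self.send_response(302)
            # Hypothetical redirect into MIMM with the authenticated identity.
            self.send_header("Location", f"{MM_URL}?user={user}")
        else:
            self.send_response(401)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8081), SsoHandler).serve_forever()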
-
GENERAL:
- All Java code is now compiled with OpenJDK.
- All Third Party & Open Source Software has been upgraded to the latest versions for better security vulnerability protection.
-
UPGRADE REQUIREMENTS:
-
CPU & MEMORY:
The MIMM v10.0 server now uses Lucene as the search engine instead of the database's text search capabilities. Consequently, the overall hardware requirements now demand as much processing resources (CPU and memory) on the MIMM Application Server machine as on the underlying Database Server machine. (MIMM v9.1 used to demand more resources on the database server, as all searches were performed using the database server.) -
DATABASE:
The MIMM v10.0 server now requires PostgreSQL database software version 10 or newer.
-
UPGRADE PROCESS:
-
POSTGRESQL DATABASE VERSION UPGRADE:
If you deployed your MIMM v9.1 server connected to your own PostgreSQL server, then you must first upgrade the PostgreSQL server to version 10 or newer, and make sure the MIMM v9.1 software still works properly after that step. -
MIMM v9.1 SOFTWARE UPGRADE & DATABASE MAINTENANCE:
You must download and apply the latest MIMM v9.1 cumulative patch (July 2018 or newer), and make sure the MIMM v9.1 software still works properly after that step.
You must then make sure that the regularly scheduled database maintenance has run successfully and long enough to have no further activities to do (such as purging deleted content from the database). -
MIMM v9.1 DATABASE BACKUP:
You must then perform a full database backup (MIMM data, indexes and scripts), restore it on a brand new (different) database server, and make sure the MIMM v9.1 server software still works properly after that step.
(Do not rely on the MIMM backup script for that process; you must use a real full database backup.) -
MIMM v10.0 SOFTWARE INSTALL & DATABASE UPGRADE:
Install the MIMM v10.0 software in a new directory (independent of the MIMM v9.1 install directory). Use the Setup utility to point to the database to upgrade. Starting the new MIMM v10.0 server will automatically perform an upgrade of the database's MIMM data and scripts. Note that MIMM v10.0 uses Lucene on the MIMM server for search, therefore the database will no longer need text search indexes and will need less space.
-
POST UPGRADE:
-
FLAT FILES:
MIMM v9.1 had no proper support for flat files, which are now fully supported as part of the new data cataloging capabilities of MIMM v10.0. Therefore, the early Flat File (CSV or XLSX) prototype beta bridges have been discontinued and replaced by the import bridge from File System (CSV, Excel, XML, JSON, Avro, Parquet, ORC), or the other new file system / object store import bridges such as Amazon S3, Azure Blob Storage, Hadoop HDFS, etc., which can all contain flat files. All these file system import bridges now create multi-model content which can be stitched to data mappings and other DI/ETL models. Any content imported from the Flat File (CSV or XLSX) prototype beta bridges can still be visible in MIMM v10.0, but no further imports can be performed, and migration to the new bridges cannot be automated (different parameters and a different content type, from single to multi model). Therefore, new models should be created with the new file system bridges, and the old content (from the prototype beta bridges) should be deleted. -
PHYSICAL DATA MODELS (PDM):
In MIMM v9.1, PDM were used to document (including data model diagrams) existing databases, as well as to design new databases (new tables, columns, etc.). In MIMM v10.0, PDM is no longer needed to document existing databases (including data model diagrams, glossary integration, etc.). Therefore, PDM is now only needed for data store requirements / database design, which requires MIMM with a "metadata authoring" license. Consequently, MIMM servers without a "metadata authoring" license can no longer create new PDM models, although any existing PDM created with MIMM v9.1 are still operational on MIMM v10.0. The bottom line is that the MIMM v10.0 new way to document data stores / databases is not only as powerful as PDM, but it is much more efficient with version management (changes after new harvesting). Consequently, we recommend converting all PDM models used to document databases into models; this can be performed without any loss of documentation by means of a conversion script available on PDM models.
v9.1.0 (GA 04/24/2017 - Deprecated 12/31/2018 - EOL 12/31/2019)
- HARVESTED MODEL CHANGE DETECTION avoids systematically creating a new version of the content upon scheduled harvesting of large database, data modeling, data integration, or business intelligence servers. This is automatically performed as a side effect of any new harvesting by a systematic and efficient (fast) model comparison with the previously harvested metadata (see the toy sketch at the end of this version's notes). This dramatically reduces the needed database space by creating fewer versions, and enables reliable notification only if a change really occurred. Note that in previous versions, the incremental harvesting already offered systematic and automatic reuse of unchanged models (e.g. BI Reports) from a large multi-model server (e.g. BI server).
- SUBSCRIPTION / NOTIFICATION mechanism on select changes at both the repository model level (e.g. a new version of a data store is harvested) and at the model object level (e.g. a terms changed or ready to approve/review in a Business Glossary under Workflow). Anyone may be assigned stewardship roles and thus will be notified as new imported content is harvested, with links back to the new object and the ability to compare using the newly re-written powerful comparator, and the ability to identify impacts of change for any architecture or configuration of assets.
- CUSTOM ATTRIBUTES on any harvested metadata (models) from data stores, data modeling, data integration, and business intelligence tools, therefore allowing one to tag any metadata (from Hive tables to BI reports, down to the column/field level) with custom properties such as a company confidential level that can be read by external products for security enforcement and compliance. Note that in previous versions, these same custom attributes could already be applied to authored metadata (models) such as terms of Business Glossaries or tables/columns of Data Models.
- BUSINESS GLOSSARY WORKFLOW for the business users behind the Metadata Explorer UI. Note that in previous versions, a (customizable) Workflow was already available in the Metadata Manager.
- DATA INTEGRATION BROWSING AND LINEAGE for the business users behind the Metadata Explorer UI, allowing them to browse the data integration jobs (from DI/ETL/ELT tools as well as SQL Scripts) and analyze the summary data flow lineage of their execution. Note that in previous versions, the browsing (open) and full detailed data flow lineage of any data integration models (DI/ETL/ELT and SQL scripts jobs) were already available in the Metadata Manager (and still are).
- DATA CONNECTION STITCHING MAJOR IMPROVEMENTS now offering stitching by column position (needed in SQL insert statements of ETL scripts), and smart case aware stitching (as some databases' namespaces like schema/table/column are case sensitive while others are not).
- BULK DATA MAPPING at the table level.
- MODEL COMPARE/MERGE: The comparison facility has been completely re-written to include comparison at every level of detail for models with the same profile (e.g., a data model from one technology and a data model from another). Even entirely different contents (e.g., an ETL and a data model, or a BI Design and a Glossary) may be compared, only at a lesser level of granularity (basically at the granularity of stitching, e.g., schema, table, column). Finally, for physical data models (including documentable models based upon harvested database structures) one may use a powerful merge feature, again with full control down to any level of granularity.
-
DATA MODELING
Major improvements and new features involving the physical data modeling capabilities centered around two major use cases:
- A data documentation tool enabled through physical data modeling based upon the structure harvested from existing data stores of the data warehouse and data lake (including traditional RDBMS and Hadoop Hive big data)
- A data requirements tool for new data stores to be defined such as self-service DI to new Hadoop Hive tables in the data lake.
- REST API SDK major enhancements with many new features
- LIGHTWEIGHT MODELS: Model content (such as a harvested database or data model) can be stored in the repository as a lightweight model (just the XML file), or fully expanded (as both the XML file and fine grained repository objects). When retaining many historical versions of a model, using lightweight models saves repository space, and also avoids slowing down the search by not indexing historical repository objects. Lightweight models cannot be directly used in a Configuration or in a Mapping. However, the lightweight model of a data store (such as an RDBMS or Hadoop Hive) can be documented with a Physical Data Model (PDM) for data model diagramming and semantic linking to a Business Glossary (BG). Such a PDM can of course be used in any Mapping or Configuration to be exposed to business users in the Metadata Explorer. Note that lightweight models can immediately (without any loss of performance) be opened in the Metadata Manager (to browse metadata or trace lineage within that model), compared (with the Model Comparator to analyze the differences between versions), or exported (for example to a BI design tool).
- UPGRADE NOTES: The Business Glossary Workflow has been significantly improved in the metadata explorer. Therefore, before upgrading to this version, it is strongly recommended that all business glossaries under workflow have published terms only (this can be achieved by listing all draft terms and either publishing or reverting them to the previously published state). Otherwise, the upgrade process will systematically publish all draft terms in their current state (which can be undesirable if the current state is not what you would like to publish). Finally, after the upgrade, you will need to manually re-enable the Workflow on these Business Glossaries.
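Conceptually, the change detection described at the top of these notes amounts to comparing a newly harvested model with the previously harvested one and only creating a new version when something differs. A toy sketch (the real MIMB comparison is a detailed model diff, not a simple hash):

# Toy sketch of harvest-time change detection (illustration only).
import hashlib, json

def fingerprint(model: dict) -> str:
    """Stable hash of a model's metadata for cheap change detection."""
    return hashlib.sha256(json.dumps(model, sort_keys=True).encode()).hexdigest()

previous = {"tables": {"CUST": ["ID", "NAME"]}}
harvested = {"tables": {"CUST": ["ID", "NAME"]}}

if fingerprint(harvested) == fingerprint(previous):
    print("No change detected: reuse the previous version, skip version creation.")
else:
    print("Change detected: create a new model version and notify subscribers.")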
v9.0.2 (GA 06/10/2016 - Deprecated 12/31/2017 - EOL 12/31/2018)
-
METADATA MANAGER:
- Major improvements in MODEL VERSION / CHANGE MANAGEMENT preventing the creation of unnecessary new versions of models if the source metadata (e.g. a database, or data model) has not changed since the last automatically scheduled metadata harvesting. This new feature is achieved by taking advantage of MIMB's new capabilities to compare the metadata of a newly imported model with a previously imported one in order to detect any change. The major benefit of this new feature is to dramatically reduce the disk space in the repository by automatically deleting unnecessary versions.
- New CONFIGURATION / VERSION CHANGE MANAGEMENT capabilities offering a comparator of versions of configurations.
-
METADATA EXPLORER:
- Major redesign of the Metadata Explorer UI on both the look & feel and actual capabilities.
-
New METADATA SEARCH/BROWSE with FILTERING AND REPORTING capabilities offering:
- New Search (or Browse) with Filtering capabilities on any attributes/properties
- New choice of result display as a classic Google like "LIST", or a new powerful "GRID" offering a spreadsheet like display of multiple attributes/properties at once. Such attributes/properties can be individually selected, reordered, and sorted, basically offering a full metadata reporting solution. The results can of course be downloaded as CSV/Excel files.
-
New METADATA EDITING Capabilities at many levels including:
- Numerous new fast and easy "in place" editing to Rename objects, Edit Descriptions, and more.
-
The new Search/Browse GRID display also offers efficient editing with:
- TABULAR EDITING of multiple objects at once such as Business Glossary Terms, Data Models Tables, or Table Columns
- BULK CHANGE of multiple objects at once, where a search can return multiple objects (that can then be selectively subsetted) for which changes can be performed at once (e.g. change the Security Threat Level to Orange for a set of tables at once)
-
METADATA AUTHORING TOOLS:
-
Common Features:
-
Major improvements in CUSTOM ATTRIBUTES (also known as User Defined Properties) to objects of the Business Glossaries and Data Models
- Custom Attributes are now common to the MIMM repository (e.g. Security Threat Level = [ Red, Orange, Yellow, Blue, Green ]) and are therefore shared between Business Glossaries, Physical Data Models, etc., providing a centralized place for maintenance (e.g. adding a new value: Purple)
- Custom Attributes can now have a default value (e.g. default value is green)
- Custom Attributes now have a much wider scope to be applied at any level from a high level repository object (e.g. harvested Model, Data Mapping, Directory) to fine grain model objects (e.g. A Business Glossary / Term, and/or a Physical Data Model / Column)
- Custom Attributes now have a security group associated to them (e.g. the Security Threat Level custom attribute may only be set by a custom Security Approved group)
- New AUDIT TRAIL for any changes in objects of Business Glossaries and Data Models, including who changed a given attribute and when
-
BUSINESS GLOSSARY:
- New Business Glossary editing capabilities are now available to business users under the Metadata Explorer UI (including tablet friendly in place editing, as well as HTML formatting of descriptions, etc.)
-
DATA MODELING:
-
Common Features:
- Major improvements in data modeling DIAGRAM EDITING: entity formatting, relationships editing, automatic layout, etc.
-
PHYSICAL DATA MODEL (data documenter for existing data stores: databases, data warehouses, data lakes):
- Major improvements (as part of the above common features)
-
LOGICAL DATA MODEL (brand new feature) (enterprise conceptual/logical information modeler):
- This new feature is postponed to the next release
-
DATA MAPPING:
-
DATA MAPPING SPECIFICATIONS (Data Flow Mapper)
- Minor improvements and bug fixes
-
DATA MAPPING DESIGN (Data Integration Designer)
- Minor improvements and bug fixes
-
ARCHITECTURE & TECHNOLOGY:
- Major database performance improvements
- Updated MIMM Web Services, which are now based on RESTful API technology (therefore removing the security vulnerabilities of the older Axis technology)
v9.0.1 GA (12/15/2015)
-
ARCHITECTURE & TECHNOLOGY:
-
100% Java delivery and installation allowing support for Windows as well as various Linux/Unix deployments
- The Metadata Management Server (MIMM) can now be installed on Unix/Linux variations.
-
The Metadata Harvesting Agent (MIMB bridges) can be installed:
- Locally (co-located with a MIMM server on Linux) to run 100% java based bridges including JDBC database bridges (Oracle, Teradata, DB2, SQL Server, etc.), big data bridges (Hadoop Hive, HCatalog), and other popular bridges such as CA ERwin xml, Informatica PowerCenter xml, Tableau BI, etc.
- Remotely on a Windows machine for C++ based bridges and COM API based bridges requiring software SDK running on Windows such as CA ERwin native (.erwin) files, and many BI tools like SAP BO Universe, Microstrategy, QlikView, etc.
- 100% HTML5 (no more Flash) and tablet friendly look & feel, allowing it to run on Mac, tablets, and more
-
Metadata Explorer Customization to offer a better experience to targeted business users of the customer company:
- Customized headers, company logos, menus, search categories, home page (with search widgets), automatic opening of reports (BI portal experience), and BI report documentation (adding BG terms)
- Improved Metadata Explorer Search Performance
-
Improved Data Integration (DI/ETL/ELT Import Bridge) Harvesting Performance
- Now offering detailed DI data flow lineage analysis on demand only (in real time), instead of pre-calculating it even when unused
-
METADATA AUTHORING:
-
Improved DATA STORE DOCUMENTER (PHYSICAL DATA MODEL):
- Live harvesting and automatic update (with version management) from popular RDBMS or big data Hadoop data stores
-
Common Data Modeling Capabilities:
- Data governance integration with Business Glossary including naming standards, reuse of terms, and creation of terms on the fly with supervised learning
- Data modeling graphical diagram editor (relationships, annotations, auto layout, etc.)
- Full integration (import and export) with data modeling tools like CA ERwin
v8.0.3 GA (05/19/2015)
v8.0.2 LA (03/31/2015)
-
DATA DOCUMENTER (NEW):
- Live harvesting and automatic update (with version management) from popular RDBMS or big data
- Data governance integration with Business Glossary including naming standards, reuse of terms, and creation of terms on the fly with supervised learning
- Data modeling graphical diagram editor (relationships, annotations, auto layout, etc.)
- Full integration (import and export) with data modeling tools like CA ERwin
-
DATA MODEL DIAGRAM VISUALIZER:
- Major improvements with the HTML5 redesign including better scalability, performance and overall layout quality
- New interactive search
- New diagram auto layout
- New dynamic layout of a diagram subset starting from an entity with all related entities with one or two levels of relationships (very useful for large diagrams)
-
SEMANTIC MAPPER:
- New support for BI Report metadata (such as a page of a workbook, a table or pie chart on a page, or just the axis of a graph), allowing users to document precise items within a given report by associating (semantic links) business glossary terms to them.
-
New support for models within a multi-model server source,
allowing documentation (or glossary) to be provided at the high level of a given data model (or BI report) within a multi-model server, such as:
- a Data Modeling (DM) repository server with many data models inside (e.g. CA ERwin Mart)
- a Business Intelligence (BI) content server with many BI designs and BI reports inside (e.g. SAP BusinessObjects)
-
BUSINESS GLOSSARY:
- New customizable role driven workflow support (can be turned off) and security enforcement
-
METADATA EXPLORER UI:
- New integrated presentation of Business Glossary terms related to any data store or BI report objects, including the ability to add/remove BG terms documenting a data store or BI report
- Improved search support for auto complete and objects
-
ADMINISTRATION:
- New group based security model (as side effect of the new role based Business Glossary workflow)
-
ARCHITECTURE & TECHNOLOGY:
-
New support for HTML5 only devices like iPad and other tablets (Flash will no longer be needed)
for graphically tracing any data flow or semantic lineage (Lineage Analyzer),
and for visualizing data models (Diagram Visualizer).
- Improved search performance for MIMM persistence on SQL Server
- Java 8 (compiled with backward compatibility with Java 7) compliance (Java 6 is no longer supported)
v8.0.1 (12/02/2014)
-
FEATURES:
- New "Show Related Reports" (e.g. from a Glossary Term)
-
METADATA MANAGER UI:
- New Metadata Manager look & feel (to match the Metadata Explorer)
- New Business Glossary batch editing
-
METADATA EXPLORER UI:
- New customizable action menus per repository object type (e.g. open BI report with BI tool by default)
- New dedicated web pages for tracing data lineage & impact, and semantic definition & usage
- New access to the Configuration's Enterprise Architecture Diagram
v8.0.0 (LA 10/01/2014)
-
NEW: Business User Interface:
- Brand New (redesigned from scratch) Business User Interface with simplified search, navigation and presentation paradigms for business users to easily access and understand the metadata assets
- Replaces the "Metadata Explorer" UI which was a read only version of the "Metadata Manager" UI with simplified metadata for business users.
-
New Add-on Business Glossary driven BI Web Portal:
- Optimized per BI technology (e.g. SAP BusinessObjects, Tableau)
- Allows any business user to browse and search for BI reports and understand them (e.g. interactive direct access to business glossary definitions or Data Lineage lookups from any item visualized in the report)
-
Allows crowdsourcing to document BI reports using the integrated BI Report Documenter:
- Full Business Glossary driven BI report documentation with reuse of business terms for definitions, and naming standards for automatic generation of logical business names from physical names
- Automatically maintains full semantic lineage between fields of the BI report and terms of the business glossary
-
IMPROVED: Metadata Authoring (MA)'s Data Documenter:
- Automatic live database update with full version management control
- Full graphical data modeling diagramming capabilities
- Full Business Glossary driven data store documentation with reuse of business terms for definitions, and naming standards for automatic generation of logical business names from physical names
- Automatically maintains full semantic lineage between objects (tables, columns, etc.) of the documented data store (data model) and terms of the business glossary
-
IMPROVED: Metadata Authoring (MA)'s Data Mapper:
- "Data Mapping Specifications" for subject matter experts, data analysts, and business users to define the mapping requirements of new data integration projects, or to document existing data flow implemented in technologies that cannot be captured by metadata harvesting.
- "Data Mapping Design" for data analysts or data integration engineers to design the data mapping with actual transformations (joins, lookups, filters, etc) for forward engineering to ETL tools
- REDESIGNED: Meta Integration Repository (MIR) Database implementation and MIMM Application Server on top
v7.2 (11/01/2013)
- NEW: Metadata Authoring (MA)'s Data Documenter: a web enabled (graphical) Data Modeling (DM) tool for subject matter experts, data analysts, and business users to document existing data stores
- NEW: Metadata Versioning (MV)'s Configuration Builder: a web enabled (graphical) Enterprise Architecture (EA) tool for IT management and Developers to build and manage the life cycle of Enterprise Architectures
- IMPROVED: Metadata Authoring (MA)'s Business Glossary for Data Governance
- IMPROVED: Metadata Authoring (MA)'s Data Mapper: to document certain "Data Flows" of the Enterprise Architecture, or to define the "semantic flow" connecting the Enterprise Business Glossary to the Enterprise Architecture's physical systems
- IMPROVED: “High Fidelity” Data Model Diagram Visualizer
- IMPROVED: Configuration Management of Multi-model Subsets
- IMPROVED: Enterprise Authentication (e.g., automatic Windows-based authentication)
v7.1 LA (04/05/2013)
-
NEW: "High Fidelity" Diagram Visualizer
-
For “Faithful” rendering of the original model diagrams as developed in ERwin
- Retains original object positions and sizes
- Properly supports auto size entities
- Retains display levels for individual objects and the diagram as a whole
- Retains original entity fonts
- Retains original attribute data type alignments
- Retains other shapes and additional documentation on the diagram
- Separates Logical and Physical Diagrams
- NEW: Diagram Relationship Analysis allows the user to highlight the PK/FKs involved in a relationship, and even generate the relationship join expression which may be cut and pasted for immediate use in SQL queries.
-
NEW: Powerful Search Language
- Allowing for sophisticated / advanced search such as: Any Words, Exact Phrase, All Words, Exclude Words, Wild Card End, Parent and Child, Exact Name, Object Type, Property Type.
-
NEW: Model Browsing Experience
- Offering a separate model browse panel simultaneously displayed adjacent to diagrams and lineage
-
IMPROVED: Semantic and Data Flow Lineage & Impact Analysis
-
Direct access to essential analysis:
- Trace Data Lineage (e.g. views to tables)
- Trace Data Impact (e.g. tables to views)
- Trace Semantic Definition (e.g. physical tables to logical entity)
- Trace Semantic Usage (e.g. logical entity to physical tables)
-
Choice of presentation:
- Simple List (of ultimate sources and/or destinations)
- Advanced Graph (of semantic and/or data flow)
-
Choice of direction:
- Reverse (for Data Lineage or Semantic Definition)
- Forward (for Data Impact or Semantic Usage)
-
IMPROVED: Data Flow Advanced Analysis
-
Separation of data flow and control flow
- Data flow (actual movement of data)
-
Control flow (conditions and filters)
- Column Control (which directly impacts values)
- Row Control (which does not directly impact values such as filters)
-
Visualization of the data transformation type of lineage using the color and/or thickness of lines:
- gray lines mean data flow with no transformation (i.e. pass through)
- black lines mean data flow with transformations (e.g. expression)
- thick lines mean data flow with transformation processes (e.g. ETL workflows that can be analyzed when clicking on such lines)
- yellow lines mean column control flow (e.g. lookups)
- yellow dashed lines mean row control flow (e.g. filters)
- blue lines mean semantic flow
-
IMPROVED: Search
- Search improvements for SQL Server wildcard/word break searches
v7.0.4 LA (10/12/2012)
-
NEW FEATURES:
- METADATA TAGGING/ANNOTATIONS/LABELS: Ability for business users to "tag" any metadata, at any granularity level from models down to a particular column
- USER COMMENT MANAGEMENT: Ability for business users to "comment" on any metadata, at any granularity level from models down to a particular column. Model Administrators (data stewards) can then review and manage these user comments (e.g. change status from Candidate to Approved). Finally, Model Administrators can push selected comments back into the original tool (e.g. ERwin model)
- LIMITED METADATA EDITOR: allowing a model administrator (data steward) to edit the basic metadata documentation such as the Business Name and Descriptions of Tables or Columns.
-
FLOATING LICENSE SUPPORT: offering new concurrent user licensing, with options for one or both of:
- Named (exclusive) users
- Floating (first come, first served) users
-
MAJOR IMPROVEMENTS:
- METADATA PROFILING: major improvements in accurately representing the original tool metamodel
- LINEAGE TRACING INTERACTIONS: major usability improvements
- DATA FLOW LINEAGE SUMMARY: greatly improved the summary lineage presentation
- METADATA EXPLORER SIMPLIFICATION: using context dependent tree navigation, therefore limiting the nesting of left navigation panels
- TOUCH SCREEN TABLET SUPPORT: with full menus available on Action button as a substitute to the right click menu
- CROSS WINDOW/TAB OBJECT NAVIGATION: with Show Object in Model Browser, in Diagram Visualizer, in Lineage Analyzer, etc.
- GENERALIZED WEB LINKS AND BOOKMARKS: to any objects in a Model Browser, in a Diagram Visualizer, in a Lineage Analyzer, as well as to the precise context of any actions like search and reporting
v7.0.2 GA (1/31/2012)
General Availability for complete Metadata Management
- UPDATED: Metadata Management (MM) for large scale multi model ETL/DI metadata
- UPDATED: Metadata Version & Configuration Management (MV) for large scale multi model ETL/DI metadata
v7.0.1 LA (10/6/2011)
Limited Availability and Limited Functionality for data modeling tool management
- UPDATED: User Experience
- UPDATED: Performance, Scalability, and Concurrency Control
- UPDATED: Packaging, Installation, and Administration
v6.2 (10/15/2010)
- NEW: Metadata Authoring (MA)'s Business Glossary Editor for Data Governance
v6.0.6 (12/04/2009)
- UPDATE: Major improvements and bug fixes on MIMB for metadata harvesting especially from all BI tools, and in particular Microsoft SSAS/SSRS. See MIMB ReadMe for details
- UPDATE: Significant performance improvements in display of lineage of large configurations
- UPDATE: Change in display of diagrams for larger models to show smaller subject areas by default, improving performance
- UPDATE: Significant performance improvements in mapper UI when mapping very large models
- UPDATE: Added ability to view and sort by all columns defined for a given profile in the business UI
v6.0.5 (MIMB GA, MIMM GA) (09/28/2009)
- UPDATE: Minor improvements and bug fixes
- UPDATE: Administration->Database tab for database performance management
- UPDATE: New script supporting performance testing and tuning
v6.0.4 (MIMB GA, MIMM beta5) (06/02/2009)
- NEW: Automatic and Scheduled Metadata Harvester, Metadata Mapper, LDAP User Integration, Role based business and technical web user interface, Role based Metadata Security Manager
v6.0.3 (MIMB GA, MIMM beta4) (01/28/2009)
- NEW: Version Manager, Configuration Version Manager, Version Migrator, Business User Interface, Model based Metadata Reporter, Metadata Profiling
v6.0.2 (MIMB GA, MIMM beta3) (10/31/2008)
- NEW: Graphical Lineage Analyzer, Business Lineage and Impact Analyzer, Enterprise wide Metadata Search Engine and Reporter, Enterprise Configuration Management, Metadata Stitcher, Metadata Configuration Manager
v6.0.1 (MIMB GA, MIMM beta2) (07/22/2008)
- NEW: Model based Diagram Visualizer, Graphical Lineage Analyzer
v6.0.0 (MIMB GA, MIMM Beta1) (05/31/2008)
- NEW: Model Import, Model Browser, Model Subsetter, Model Export
4. System requirements
4.1 Important preliminary disclaimer notice on all requirements
The following requirements only define the minimal requirements to run the application server with reasonable performance based on the provided tutorial, or small business use cases. The actual requirements for enterprise wide use cases based on larger models and configurations do require significantly greater resources to obtain acceptable performance.
The following requirements are based on:
- minimal to no network overhead (assuming both the database and Application servers to be locally installed),
- vendor's default install of the current version of their software (with all current service or fix packs),
- no other applications sharing such hardware (starting from a clean machine),
Any other hardware/software configurations are acceptable as long as they provide the same (or better) results on the provided performance benchmark. In such a case, if any problem is discovered (e.g. scalability or performance issues), then the customer must be able to reproduce the issue using an environment that conforms to the minimum performance requirements as defined herein.
Potential known issues include (but are not limited to) the following:
- actual usable hardware performance on virtual environments (e.g. VMware configuration and licenses)
- network overhead on remote servers (e.g. bandwidth, proxy, VPN issues, VMWare inter OS network limitations without a proper license, etc.)
- shared resources with competing applications on the same OS, or between OS on a virtual environment,
- licensing limitations (e.g. most database server licenses limit the number of usable core/CPU)
- vendor software known limitations and requirements (e.g. Oracle on VMWare vs Oracle VM)
4.2 Web Client requirements
Users only need an internet browser:
- Google Chrome v67 or newer
- Microsoft Edge v41 or newer
- Mozilla Firefox v61 or newer
- Apple Safari 11.1 or newer
4.3 Application Server Requirements
Hardware Minimum Requirements (based on physical hardware performance, not a virtual environment):
- 2 GHz or higher quad core processor (4 cores)
- 8 GB RAM
- 10 GB of disk space (all storage is primarily in the database server, although the search index and metadata cache are on disk)
Apache Tomcat Application Server Configuration:

Active Concurrent Users see (1) | Users | Memory (GB) see (4)  | CPU (cores)          | Database Connections see (5)
                                |       | Per User | Total     | Per User | Total     | Per User | Total
Light Users see (2)             | 200   | 0.1      | 20        | 0.1      | 20        | 0.2      | 40
Heavy Users see (3)             | 100   | 2        | 200       | 0.25     | 25        | 1        | 100
Server Database caching         |       |          | 10        |          |           |          |
Server Search index caching     |       |          | 10        |          |           |          |
Extra safety buffer             |       | 25%      | 55        | 25%      | 11.25     | 25%      | 35
Total                           |       |          | 295       |          | 56.25     |          | 175

(1) Active Concurrent Users during peak daytime (among thousands of potential users on SSO / LDAP), and assuming all metadata harvesting and indexing is scheduled over night.
(2) Light users are typical business / end users performing search, browse, report, review, comment.
(3) Heavy users are typical advanced developers performing editing, mapping and complex lineage.
(4) Memory (GB) as configured in $MM_HOME/tomcat/conf/tomcat.properties or with the $MM_HOME/Setup.sh utility.
(5) Database connections as configured in $MM_HOME/tomcat/conf/MetaIntegration/localhost/MM.xml
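As a worked example, the totals above are derived as follows (the 25% safety buffer applies to the per-user subtotals):
Memory (GB): 200 x 0.1 + 100 x 2 = 220; buffer 25% x 220 = 55; 220 + 10 + 10 + 55 = 295
CPU (cores): 200 x 0.1 + 100 x 0.25 = 45; buffer 25% x 45 = 11.25; 45 + 11.25 = 56.25
Database connections: 200 x 0.2 + 100 x 1 = 140; buffer 25% x 140 = 35; 140 + 35 = 175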
Operating System Requirements:
-
Most popular Linux/Unix 64 bit Operating System versions (such as Red Hat).
-
About Java Dependency:
This software is not based on code compiled for any particular Operating System (e.g. any particular Linux distribution or version); therefore, the supported OS are 100% based on the supported OS of the underlying Java Runtime Environment (JRE) and Tomcat software (see their supported platforms for further details). -
About Headless Linux:
Recent versions of the Java Runtime Environment (JRE), such as 11, no longer bundle any fonts, and therefore rely on the underlying operating system fonts. Most bare Linux configurations bundle a minimum set of fonts, including the bare Linux VMs offered by most cloud providers. However, when using an extremely bare minimum, truly headless Linux configuration, fontconfig and libfontconfig1 must be installed and configured on the system. Otherwise the JRE cannot access the fonts needed in order to perform diagram rendering.
-
Microsoft supported Windows 64 bit versions (including Windows 2008 Server, Windows 2012 Server, Windows 2016 Server, Windows 2019 Server, Windows 7, Windows 8.x, Windows 10, and Windows 11).
- Ensure that installer is executed with full Administrator privilege
- Ensure that Microsoft .NET Framework 3.5 or higher is installed
- Ensure that all current Microsoft Windows critical updates have been applied
Application Server Engine Requirements:
- Apache Tomcat 9 - 64 bit (bundled)
- Other Application Servers (such as IBM WebSphere or Oracle WebLogic) require manual install/setup, and are therefore not supported by this version.
Java Runtime Environment (JRE):
- OpenJDK 11 - 64 bit (bundled and recommended)
- Other Java Runtime Environment (JRE) (such as IBM Java) require manual install/setup, and are therefore not supported by this version.
Note that Java bridges are compiled to be backward compatible with JRE 8.
4.4 Database Server Requirements
Hardware Minimum Requirements (based on physical hardware performance, not a virtual environment):
- 2 GHz or higher quad core processor
- 8 GB RAM
- 20 GB of disk space (or more as needed for the data)
For small deployments (or quick proofs of concept), the MIMM software package bundles PostgreSQL (for Windows only) as the MIMM Database Server (which can run on the same machine as the MIMM Application Server). See the Application Server Installation and Configuration section for details.
The MIMM Database Server can reuse your existing Oracle, SQL Server, or PostgreSQL server:
-
Oracle 12c, 18c and 19c - 64 bit
Warning: Oracle Real Application Clusters (RAC) is not supported, especially when configured for automatic load balancing between nodes where a write on one node will need to be replicated on the other nodes.
- The character set of the database must be AL32UTF8 (UTF8).
-
In order to find out what exact Oracle edition/version is actually installed:
sqlplus.exe SYS@<DB-NAME> as SYSDBA
select banner from v$version where BANNER like '%Edition%'; -
In order to find out how much memory is actually available to the Oracle database, it is important to first understand
how Oracle's memory is configured and used:
-
The actual available System Global Area (SGA) memory can be found using:
sqlplus.exe SYS@<DB-NAME> as SYSDBA
show sga;
select * from v$sga;
select * from v$sgainfo; -
The actual available Program Global Area (PGA) memory can be found using:
sqlplus.exe SYS@<DB-NAME> as SYSDBA
select * from v$pgastat;
-
In order to find out how many processing CPUs/cores are actually available to the Oracle database, query the view
v$parameter for the value of
cpu_count, or query the view
v$license as follows:
sqlplus.exe SYS@<DB-NAME> as SYSDBA
select * from v$license;
-
Microsoft SQL Server 2008 R2 to 2022 - 64 bit
Warning: With respect to cloud versions of SQL Server, Microsoft Azure SQL Database and SQL Server on Virtual Machines may work if configured properly per the further requirements in the SQL Server setup below.
However, RDS for SQL Server cannot be used because you cannot grant UNSAFE ASSEMBLY (according to the documentation) to deploy the stored procedure assembly per the further requirements in the SQL Server setup below.
- Detailed requirements are defined as part of the SQL Server setup below.
- Make sure you apply the current Microsoft patches.
-
PostgreSQL 13.3 (or newer) - 64 bit
- libxml is needed, you might need to rebuild PostgreSQL with it.
Database Administrator privileges are required to install/setup/uninstall the database.
In general, one must ALWAYS install the latest service packs for a given database version BEFORE creating the MIMM database. E.g., for Oracle 11.2 one is required to apply the patches to upgrade to 11.2.0.3, or whatever is the latest patch level at the time. In addition, Oracle 11.2.0.4 must have patch 17501296 applied.
Virtual Memory: For a Windows based database server, be sure to either:
- set the page file size to be managed automatically by the OS,
- or set it to at least 3 times the RAM size of the machine.
5. Database Server Setup
The MIMM Application Server requires a connection to an existing database server for metadata storage (the metadata repository).
However, a quick install for test or QA purposes can be achieved by using the bundled PostgreSQL database.
See the section Metadata Management (MIMM) Application Server Setup for more details.
The following database setup scripts and instructions assume the following defaults:
Database Name: MM
Database User: MM
Database Password: MM123!
The database name and user name can be changed, and the password should of course be different.
After the product is fully installed and web connectivity has been made, one may connect to a different database by way of the web based user interface at Tools -> Administration -> Database.
5.1 Database on Oracle
Create a user MM and a database MM with the following privileges:
sqlplus.exe SYS@<DB-NAME> as SYSDBA
-- Delete previous user and database if needed
-- DROP USER MM CASCADE;
CREATE USER MM IDENTIFIED BY "MM123!";
GRANT CONNECT TO MM;
GRANT CREATE TABLE TO MM;
GRANT CREATE VIEW TO MM;
GRANT CREATE SEQUENCE TO MM;
GRANT CREATE TRIGGER TO MM;
GRANT CREATE PROCEDURE TO MM;
GRANT CREATE TYPE TO MM;
GRANT EXECUTE ON DBMS_LOB TO MM;
-- If you get the error "Database exception occurred: ORA-01950: no privileges on tablespace 'USERS'"
-- ALTER USER MM QUOTA UNLIMITED ON USERS;
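Once the user and privileges are in place, the JDBC URL entered later in the installer (or at Tools -> Administration -> Database) follows the standard Oracle thin driver syntax, for example (hypothetical host and service name):
jdbc:oracle:thin:@//<dbServer>:1521/<serviceName>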
5.2 Database on Microsoft SQL Server
5.2.1 Database Requirement 1 - Case Insensitive
The database must be configured to interpret SQL in a case insensitive manner.
The case insensitive collation must be Latin1_General_CI_AS.
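As a quick sanity check (a sketch, assuming the database is named MM), the server and database collations can be queried with:
SELECT SERVERPROPERTY('Collation');
SELECT DATABASEPROPERTYEX('MM', 'Collation');
Both should report Latin1_General_CI_AS.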
5.2.2 Database Requirement 2 - Mixed-Authentication Mode
The Mixed-Authentication Mode is usually set during the SQL Server installation process.
The Mixed-Authentication Mode can be verified or changed by using the SQL Server Management Studio: first sign in, then right click on the root of the tree (instance of SQL Server Express), go to Security, and finally select "SQL Server and Windows Authentication mode"
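The current mode can also be verified with a query (a minimal sketch):
SELECT SERVERPROPERTY('IsIntegratedSecurityOnly');
-- returns 0 for Mixed-Authentication Mode, 1 for Windows Authentication only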
5.2.3 Database Requirement 3 - TCP/IP Protocol Enabled
The TCP/IP Protocol must be enabled in the SQL Server Configuration Manager for both the named instance and the client protocols (Make sure you restart the service after changing).
5.2.4 Database Requirement 4 - Login must be Database Owner
The database login (e.g., MM) that will connect MIMM to the SQL Server database must be the owner of the database.
5.2.5 Database Requirement 5 - SQL Common Language Runtime (CLR) Strict Security for SQL Server 2017 or newer
All database intensive operations (not just database maintenance, but also trace lineage, delete models, etc.) are implemented in SQL Server as stored procedures written in C#, compiled and delivered in a signed stored procedure assembly called MIRRepo that will be created with the SAFE permission set.
If you are using SQL server 2017 or newer,
then CLR Strict Security is enabled by default.
Therefore, the certificate used to sign the stored procedure assembly
must be imported in the database and granted the UNSAFE assembly permission
(See Microsoft SQL Docs on CLR strict security)
using the following commands:
CREATE CERTIFICATE MIRRepoCert FROM BINARY = 0x308203663082024ea00302010202045eece216300d06092a864886f70d01010b05003075310b30090603550406130255533113301106035504080c0a43616c69666f726e69613116301406035504070c0d4d6f756e7461696e2056696577312a3028060355040a0c214d65746120496e746567726174696f6e20546563686e6f6c6f67792c20496e632e310d300b06035504030c044d6d4462301e170d3230303630313037303030305a170d3330303630313037303030305a3075310b30090603550406130255533113301106035504080c0a43616c69666f726e69613116301406035504070c0d4d6f756e7461696e2056696577312a3028060355040a0c214d65746120496e746567726174696f6e20546563686e6f6c6f67792c20496e632e310d300b06035504030c044d6d446230820122300d06092a864886f70d01010105000382010f003082010a0282010100c2ccf729a28a90958f71a68f6acca9f20b5c256b7c76565b2ece0cd1789bec85e9ab538ac38dc268e48c10e17d3eca1aeb14034bc67bafc05475ed013495aada683c74885f12a8bdbf2025ec3c5a0172010e7055ab27a853e77611ee6ae846453702d18ae3080977ddaee50a282b9dab3f077fe1630804b24f05c58280621dc1426fff7115e8a791435687096c09f754608bb9a6ce00002f7131f09cffd417678bddb8f7a703e4e688f2f0af501c52ecef2cbea3d37c45da4239ddb53295adaddb11dc0118b3188adf812c983d5676c5b7356d68e2258ea32cd3216db21dae49df16d2aa1aef39c618e393ce7e1b131b241c557414424fb6c17c825022a5a4270203010001300d06092a864886f70d01010b05000382010100a1db34a6cda0729a796e5ed0fe5b2f4813ff74bf96300c9ca30fb84be44bd7d0bc46c96a0726eae5e829985429ff4ff09b50ece907c5b8c7f8a71f7a16781103d7eaf2e1c7afa39e4774293610e0d04e6b0c76dc9a85891e6f5fed09059960dc7e2a7c1dc14d64aab9718747752d394b22e339da2c7e6ced1626dde991818cbcaf049d8f112a98b2aa2e80d1168f797a6c992e304e4572b4edcf40d270a281f82d7bde64e8d8b5d83574ecf5470f3d1a9d710498e133e9309a043f63b1682972678fba2a33267999795b5d040524e2f875b667dcec08d310e27b6086b2667dde70d4401fe501944f70581e559d5f3f5b72e49ff722e58594b84a8d15d5dd1414;
CREATE LOGIN MIRRepoCertLogin FROM CERTIFICATE MIRRepoCert;
GRANT UNSAFE ASSEMBLY TO MIRRepoCertLogin;
Alternatively, you can disable the SQL Server CLR Strict Security as follows:
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'clr strict security', 0;
RECONFIGURE;
EXEC sp_configure 'show advanced options', 0;
RECONFIGURE;
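In either case, the current value of the setting can be verified as follows (a sketch):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'clr strict security';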
5.2.6 Database Preparation
Login to SQL server as a user with server admin role and execute the following commands to create a database "MM" and a user "MM" with password "MM123!" (or another one):
Enable clr, and create the database and user:
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
Go
CREATE LOGIN MM WITH PASSWORD = 'MM123!';
CREATE DATABASE MM;
ALTER DATABASE MM SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE MM SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE MM SET MULTI_USER WITH ROLLBACK IMMEDIATE;
ALTER AUTHORIZATION ON DATABASE::MM to MM;
Warning: The product relies on one assembly (named MIRRepo) which is loaded from binary and not from file. This binary is created with the SAFE permissions. So in addition to being the database owner, the MM user should be granted the CREATE ASSEMBLY permission.
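For example, the permission can be granted as follows (a sketch; note that if the MM login maps to the database owner dbo, it already holds this permission implicitly, so the grant applies to setups where a separate MM database user exists):
USE MM;
GRANT CREATE ASSEMBLY TO MM;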
5.2.7 Database Connection
Advanced SQL Server Administrators may define ("hard-code") a set of TCP/IP ports for SQL Server to run over the network. However, Microsoft now recommends running the "SQL Server Browser" service, which can be enabled either in the Services panel or the SQL Server Configuration Manager.
For more information, read: How to: Configure Express to accept remote connections
The connection string syntax is:
jdbc:sqlserver://<dbServer>:<dbPortNumber>;databasename=<dbName>
To connect to a named SQL server instance other than the default:
-
If the SQL Server browser service is running:
-
If the named instance is configured to listen on dynamic ports:
In the installer, specify only the instance name (in the format HOSTNAME\INSTANCENAME) and no port (the port field should be left empty), such as:
jdbc:sqlserver://localhost\sqlexpress;databaseName=MM; -
If the named instance is configured to listen on static IP ports:
The SQL Server instance must be configured to run on a static TCP/IP port and that port must be specified in the installer, such as:
jdbc:sqlserver://localhost\sqlexpress:1433;databaseName=MM;
-
If the named instance is configured to listen on dynamic ports:
-
If the SQL Server browser service is not running:
In the installer, specify only the instance port, such as:
jdbc:sqlserver://localhost:1433;databaseName=MM;
To connect to SQL Server using domain account:
-
Find the mssql JDBC driver under
$MM_HOME/java/jdbc/mssql
, e.g.mssql-jdbc-7.4.1.jre11.jar
. -
Download a Microsoft JDBC driver for SQL Server with the same version, e.g. 7.4, and extract the content.
There you will find a
sqljdbc_auth.dll
from Microsoft JDBC driver x.x forSQL Server\sqljdbc_x.x\enu\auth\x64
, and amssql-jdbc-x.x.x.jre11.jar
from Microsoft JDBC driver x.x forSQL Server\sqljdbc_x.x\enu
-
Copy the
sqljdbc_auth.dll
to$MM_HOME/bin
and replace themssql-jdbc-x.x.x.jre11.jar
under$MM_HOME/java/jdbc/mssql
with the one from Microsoft JDBC driver x.x forSQL Server\sqljdbc_x.x\enu
-
At the Configure Database Connection window add the string
;integratedSecurity=true
at the end of the Database url, such asjdbc:sqlserver://localhost:1433;databasename=MM;integratedSecurity=true
. Specify other fields and click TEST CONNECTION.
Note 1: The default database instance name for SQL Server Express is "sqlexpress", and "sqlserver" for any other SQL Server edition.
Note 2: The default SQL Server TCP/IP port number is 1433.
5.3 Database on PostgreSQL
Login to an existing database as a database superuser or a user who has CREATEROLE and CREATEDB privileges
psql.exe -h <HOST-NAME> -W -U <USER_NAME> -p <PORT> -d <DATABASE_NAME>
-- Delete previous user if needed
-- DROP USER "MM";
-- If the user cannot be dropped due to any ownership issues, you'll need to reassign those objects to another user
-- REASSIGN OWNED BY "MM" TO <OTHER-USER-NAME>;
-- Or drop those objects
-- DROP OWNED BY "MM"
-- Create a user MM with LOGIN privilege
CREATE USER "MM" LOGIN PASSWORD 'MM123!';
-- Create a database MM with UTF8 encoding.
CREATE DATABASE "MM" WITH OWNER "MM" ENCODING 'UTF8';
Note: For maintenance reasons, PostgreSQL database indexes can be rebuilt as follows:
- Stop the MM Tomcat server
-
run the following command at the command prompt, assuming postgresql/bin is in the system path. This may take some time if your database is large.
reindexdb --username=MM --dbname=MM -v
- Restart the MM Tomcat server
6. Application Server Setup
6.1 Server Installation and Configuration
The MIMM Application Server is installed as follows:
-
On Linux operating systems, use
tar -xjvf
to extract the software package (.tbz2
format) into the directory of your choice, by default located at:export MM_HOME=/MetaIntegration
Depending on your software installation directory, you may need "root" privileges.
-
On Windows operating systems, use
unzip
to extract the software package (.zip
format) in the directory of your choice, by default located at:set MM_HOME=C:\MetaIntegration
Depending on your software installation directory, you may need "Administrator" privileges. You should avoid using the "Program Files" directories of Windows as they have are now controlled by Windows with special access rights.
Finally, note that this admin document assumes the Linux notation for scripts and file locations, such as $MM_HOME/Setup.sh. The matching Windows scripts and file locations can be inferred; in the above example: %MM_HOME%\Setup.bat.
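For example, on Linux (a minimal sketch, assuming a hypothetical package file name MIMM.tbz2 downloaded to /opt):
cd /opt
tar -xjvf MIMM.tbz2
export MM_HOME=/opt/MetaIntegration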
If you are using an existing database and do not wish to customize the application server (e.g. memory allocation, Windows services), then you can skip this step and go directly to the section on Application Server Execution and Initialization.
Otherwise, go to the software home directory and execute the $MM_HOME/Setup.sh
utility. Note On Windows, you must "Run as administrator" the %MM_HOME%\Setup.bat
utility.
This setup utility allows you to set up the configuration parameters defined below through a user friendly application.
After any change on any panel (tab) below, remember to press the Configure button in order to perform the configuration changes.
A dialog box will confirm success or failure (with error messages).
Alternatively, this setup utility also works at the Windows command line or Linux shell; use the -? option to list the available options:
[{ -? | --? | /? | --help }] Asking for help
[{ -tp | --tomcat-port }]
[{ -tm | --tomcat-memory }]
[{ -ta | --tomcat-agent }]
[{ -ts | --tomcat-service }]
[{ -s | --ssl }]
[{ -sc | --ssl-cert-file }]
[{ -sk | --ssl-key-file }]
[{ -sp | --ssl-key-password }]
[{ -sr | --ssl-root-chain-file }]
[{ -ds | --database-service }]
[{ -dp | --database-port }]
[{ -dc | --database-connection }]
[{ -du | --database-user }]
[{ -dw | --database-password }]
[{ -da | --database-auto-upgrade }]
[{ -mh | --mail-host }]
[{ -mp | --mail-port }]
[{ -mu | --mail-user }]
[{ -mw | --mail-password }]
[{ -ms | --mail-sender }]
[{ -mx | --mail-external-url }]
[{ -ma | --mail-admin }]
[{ -ep | --erwin-product }]
[{ -op | --oracle-product }]
[{ -ch | --certificate-host }]
[{ -cp | --certificate-port }]
[{ -we | --webapp-enable }]...
[{ -wd | --webapp-disable }]...
Some of these MM setup values (and other ones) can also be defined in $MM_HOME/conf/mm.conf.properties.
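For example, to change the Tomcat port number from the command line (a sketch with a hypothetical value; run the utility with -? to confirm the expected format of each option):
$MM_HOME/Setup.sh -tp 19980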
-
Database Server tab:
This is to be used only if you wish to use the bundled PostgreSQL database.-
Enable Windows Service
This will create the 'MM Database Server' Windows Service, set it for automatic start, and actually start it. Note that if no database already exists in$MM_HOME/postgresql/data
, then a new database will be created.
Unchecking that box will delete the 'MM Database Server' Windows Service, which is a good idea before uninstalling the MIMM software. Note that your existing database in$MM_HOME/postgresql/data
will not be deleted as a side effect.
Port Number
This is set to 5432 which is the default of PostgreSQL, but can be changed to avoid conflicts with other servers.
-
Application Server tab:
-
Enable Windows Service
This will create the "MIMM Application Server" Windows Service, set it for automatic start, and actually start it. Unchecking that box will delete the "MIMM Application Server" Windows Service, which is a good idea before uninstalling the MIMM software. -
Metadata Harvesting Server Only
This allows you to set up this application server as a metadata harvesting server only, rather than a full metadata management server. This is very useful in architecture deployments where the metadata management server is:
- deployed on Linux but needs to access metadata harvesting servers (agents) on a Windows machine where DM/DI/BI client tools are Windows only (e.g. COM based SDK).
-
Metadata Harvesting Browse Path
This controls the access to the file system for metadata harvesting. The default value is set to '*' which means any Windows drive (C: and any mounted remote drive R:) or any directory from root on Linux. It is strongly recommended to limit the access to a common shared data location, and to avoid system areas.
Data Directory
This is the location of all data files, including log files as well as the metadata harvesting cache. The data directory is located by default in the 'data' subdirectory of the application server home directory. It is recommended to separate the program data from the program files; this allows you to provide a new location for the data in a separate area (with regular backups if possible). Note that changing to a new location will not move the existing data from the previous location. Either the new location already had the data (from a previous install), or new data will be created.
Max Memory
This defines the maximum memory used by Java (JRE) on the MIMM Application Server (Apache Tomcat). This is unrelated to the maximum memory used by java on bridges for Metadata Harvesting which is separately set by default with the variableM_JAVA_OPTIONS
in$MM_HOME/conf/conf.properties
, and can be overridden within the Miscellaneous parameter of memory intensive import bridges (e.g. JDBC). -
Port Number
This sets a custom start port number by default to avoid conflicts with other web application servers. Note that the MIMM Application Server uses 2 consecutive ports. However, this can be set back to 80 to avoid having to specify any port number in the URL. -
SSL
This enables Secure Socket Layer (SSL) communication for web access (HTTPS). In order to support HTTPS, the MIMM Tomcat service must be configured to work with HTTPS for encryption of passwords and other content exchanged between the web client and the MIMM Application Server. In this case, you will need a certificate for the HTTPS protocol to work. Note: the MIMM software does not perform any error handling for validating a certificate associated with the MIMM Application Server, and most web browsers will report an error if the certificate is not provided by a valid certificate authority. Thus, your certificate should be a trusted certificate provided to you by a valid Certificate Authority.-
Certificate file
Mandatory - Can be a .pem (privacy enhanced electronic mail), .pfx (Windows personal information exchange) or a .jks (Java keystore) -
Root Certificate file
Optional (only required if the above certificate file was generated by an external company as a certificate authority) -
Key file
Optional (often it is the case that the certificate file also contains the key) -
SSL Key Password
Optional (required if the key is password protected)
6.2 Application Server Upgrade
6.2.1 Understanding the Data Locations
Most application data is obviously located inside your database server, and you are responsible for regular backups of that database. Upgrading your application may also upgrade the associated database content (database schema, stored procedures, indexes and of course data). It is important to understand that part of the MIMM software is implemented as database stored procedures (e.g. tracing the lineage). Therefore, a given version of MIMM corresponds to a version of the MM software in the application server (tomcat) and a version of the MM software in the database (stored procedures). Consequently, make sure you always backup your database before any upgrade. Furthermore, the upgrade process may take several hours (on large repositories) and also needs extra space for temp data during the migration. Therefore, make sure the database has at least 20% free space.
Finally, it is also important to understand that the software installation directory (known as
$MM_HOME
in this document)
also contains some critical application data and application setup customizations that have to be taken into account in your backup or upgrade process, including:
-
postgresql/data
which contains the actual PostgreSQL database data only on Windows and only when configured with the Setup utility (in the "Database Server" tab). -
-
data which contains other application data, including:
- download/MIMB/ contains third party download software packages
- files/mimb/ contains files for upgrade of MIMBWebServices
- mm/
  - analytics/ contains the generated analytics data files (new beta feature).
  - backups/ contains model backups for model comparison purposes.
  - operations/ contains the operation execution generated files such as a troubleshooting package or a backup ready to download. As these generated files can be very large, it is critical to execute the operation "Delete operation logs and files" either manually (MANAGE > Repository, right click on the repository root) or automatically (MANAGE > Schedules).
  - sessions/ contains any temporary files for user login sessions.
  - tmp/ contains other temporary files.
- logs/ for the log files of tomcat and mimb:
  - mimb/ contains the model import/export bridge log files.
  - search/ contains the search query log files.
  - tomcat/ contains the tomcat log files.
- MIMB/
  - cache/ contains bridge execution files organized as BridgeId/ImportId/mir_Version,nativeVersion
  - parameters/ contains bridge execution parameter files (only for some bridges like DI/ETL with runtime variables) organized as BridgeId/ImportId/Parameters
- search/ contains the Lucene search engine indexes (they will be automatically fully rebuilt from scratch if the folder is empty, which will take a lot of time)
- temp/ for any temporary files (including from the MIMB bridges)
- webapps/ for application server (tomcat) cache
-
conf which contains the MM configuration / customizations organized as follows:
- conf.properties file containing most customizations defined with the Setup utility (in the "Application Server" tab)
- ModelBridgeList.xml contains the list of enabled bridges and their names
- resources/ directory containing any User Interface Customizations, in particular MM.properties and MetadataExplorer.xml.
- Template/ contains the default template files of all the above files/directories, including a potentially new updated ModelBridgeList.xml or MetadataExplorer.xml after cumulative patches.
-
tomcat/conf with the tomcat.properties file containing the tomcat port and memory customizations defined with the Setup utility (in the "Application Server" tab), and the keystore file containing the tomcat SSL certificates defined with the Setup utility (in the "Application Server" tab).
-
jre/lib/security which also contains some SSL customizations defined with the Setup utility (in the "Application Server" tab). It is recommended not to reuse this directory, but rather to reinstall the SSL keys with the Setup utility.
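For example, these critical directories can be backed up on Linux before an upgrade with a simple archive (a minimal sketch; adjust the target path to your environment):
cd $MM_HOME
tar -cjf /backup/mm-home-$(date +%Y%m%d).tbz2 data conf tomcat/conf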
6.2.2 Upgrade Process
We recommend the following upgrade process:
- Stop your MIMM application server in the same way it was started, such as stopping the Windows service / Linux daemon, stopping the desktop command windows, or using tomcat/bin/shutdown. Remove the application server service if one was created (Windows only).
-
Stop your MIMM database server
and ONLY if you are using the bundled PostgreSQL database on Windows then use the
$MM_HOME/Setup.sh
utility to remove the Database Service as follows: Go to the Database Server tab, disable (uncheck) the database service check box, and click on the Configure button. - Backup your MIMM data including your database and data file directories as explained above.
-
Backup your MIMM software
by copying the
$MM_HOME
directory as$MM_HOME_OLD
. -
Install the complete MIMM new software
(ONLY needed for clean install of a new version)
by deleting the old
$MM_HOME
, and then creating a new one by unzipping the new MIMM full package. -
Apply the latest MIMM software cumulative patch
by unzipping it from inside the new
$MM_HOME
directory.- WARNING 1: Make sure you unzip with overwrite on Windows, and use
unzip -u
on Linux to update files while retaining permissions.
- WARNING 3: You cannot reverse / unzip an older cumulative patch, you must restart from a clean install of the original GA version.
- WARNING 1: Make sure you unzip with overwrite on Windows, and use
-
Restore your MIMM data and software customization/setup
(ONLY needed for a clean install of a new version) by copying the appropriate files and directories (as previously explained) from
$MM_HOME_OLD
to$MM_HOME
, including at least$MM_HOME/data
and$MM_HOME/conf/conf.properties
but possibly more as used and customized such as$MM_HOME/conf/resources
, or$MM_HOME/tomcat/conf
. -
Integrate the MIMM new software features in your configuration
by copying potential new versions of files from
$MM_HOME/conf/Template
into their matching directories in$MM_HOME/conf/
. For example the new$MM_HOME/conf/ModelBridgeList.xml
may contain some new or updated bridges. WARNING: if you had customized some files such as$MM_HOME/conf/resources/MM.properties
, you must re-apply/merge such customizations starting from the new version of that file copied from Template. -
Reconfigure your MIMM Database Server
(ONLY if you are using the bundled PostgreSQL database on Windows), as the PostgreSQL server software version may have changed and the database server may need to be upgraded.
Therefore, you must:
-
Execute the (old renamed installation)
$MM_HOME_OLD/Setup.sh
utility to restore the old Database Service as follows: Go to the Database Server tab, enable the database service check box, and click on the Configure button. -
Execute the (new installation)
$MM_HOME/Setup.sh
utility as follows: go to the Database Server tab, enable the database service check box, and click on the Configure button. At this point the$MM_HOME/Setup.sh
utility will retrieve the existing PostgreSQL data from the old install directory (which was already running as a service), will migrate them to the new install directory, and will remove the old PostgreSQL service, before starting the new PostgreSQL service on the new directory.
-
Execute the (old renamed installation)
- Restart your MIMM database server (ONLY if you are using the bundled PostgreSQL database on Windows).
- Restart your MIMM application server after which your first login as Administrator may prompt you for an upgrade of your MM database.
- Redo all above steps for any other MIMM Application server configured as MIMB harvesting agent.
- Update your MIMM repository content (ONLY as needed) if the upgrade contains new and improved import bridges that require fully re-importing the models (remove incremental harvesting in such a case), which will in turn require re-building the related Configurations.
Typical DevOps Linux bash scripts include the following commands:
#! /bin/bash
##############
# MM INSTALL #
##############
######## START SETUP ########
sudo su
MM_INSTALL=/opt
MM_INSTALLER=MIMM-1100-20221231
MM_SOFTWARE=MetaIntegration
MM_HOME=${MM_INSTALL}/${MM_SOFTWARE}
######## DOWNLOAD INSTALLER TO INSTALL DIRECTORY ########
# get $MM_INSTALL/$MM_INSTALLER
######## STOP SERVER ########
$MM_HOME/ShutdownServerApplication.sh
# double check for any tomcat process still running
# ps -edf | grep tomcat
# kill -9
######## FULL CLEAN INSTALL NEW SERVER ########
# Backup old install and extract new one
cd $MM_INSTALL
mv $MM_SOFTWARE $MM_SOFTWARE.yyyymmdd
tar -xjvf $MM_INSTALLER.tbz2
rm $MM_INSTALLER.tbz2
# optional ThirdParty software extract for brand new install on machines with no access to internet
mv *ThirdParty*.zip $MM_SOFTWARE
cd $MM_SOFTWARE
unzip *ThirdParty*.zip
rm *ThirdParty*.zip
######## COPY SETUP AND DATA ########
cd $MM_INSTALL/$MM_SOFTWARE.yyyymmdd
# data directory with large MM search index and MIMB cache
# cp -r data ../$MM_SOFTWARE
mv data ../$MM_SOFTWARE
# conf properties (memory, path, etc.)
#M_BROWSE_PATH=
#M_JAVA_OPTIONS=-Xmx64G
cp conf/conf.properties ../$MM_SOFTWARE/conf
# tomcat conf properties (URL, ports, services, etc)
cp tomcat/conf/tomcat.properties ../$MM_SOFTWARE/tomcat/conf
######## HARVESTING AGENT SERVERS ########
# conf agent identification to MM server
cp conf/agent.properties ../$MM_SOFTWARE/conf
# conf agent access to SSL based MM server
$MM_HOME/Setup.sh -ch abc.def.com -cp 443
######## MM MAIN SERVER ########
# MM conf properties (database connection, email setup, etc.)
cp conf/mm.conf.properties ../$MM_SOFTWARE/conf
# MM resources such as the UI look & feel
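# make sure cp is not aliased (e.g. to cp -i) so the recursive copy below does not prompt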
alias cp=cp
cp -rf conf/resources ../$MM_SOFTWARE/conf
# Tomcat SSL Setup
cp tomcat/conf/keystore ../$MM_SOFTWARE/tomcat/conf
cp tomcat/conf/server.xml ../$MM_SOFTWARE/tomcat/conf
######## RESTART SERVER ########
$MM_HOME/RestartServerApplication.sh
######## END ########
6.2.3 Version Specific Upgrade Issues and Recommendations
Upgrading to a new version may have version specific issues or recommendations that are listed at the bottom of the release notes: see Release Changes for more details.
6.2.4 Upgrade and Migration Best Practice
The following critical steps represent the best practice in MIMM Server upgrade or migration to a new machine (on prem or cloud).
Backup
As with any migration / upgrade process, it is critical to backup the underlying data:
- the installation data and conf directories (see Understanding the Data Locations)
- the repository database. (see Database Server Backup/Restore)
Repository Cleanup
One of the most critical first steps is to save disk space and speed up performance by performing a major cleanup of the repository:
- Apply the latest MIMM and MIMB Cumulative patches (in the MIMM Server and all MIMB Harvesting Agents).
- Make sure that all Configurations are NOT on auto update.
- Stop any scheduled operations, including all automatic metadata harvesting and database maintenance (MANAGE > Schedules).
- Delete all unused / test Directories, Configurations, Models, Mappings, etc. (MANAGE > Repository).
- Delete as many versions as possible (e.g. retaining the last few versions) of the remaining critical Configurations (MANAGE > Repository).
- Delete unused model versions (MANAGE > Repository: Repository object > Operations > Delete unused versions).
- Run the database maintenance to purge all deleted objects from the database (MANAGE > Schedules: Run Database Maintenance). Note that each database maintenance run purges deleted models for only 2 hours, therefore the database maintenance has to run as many times as needed until the log no longer shows any models to delete.
Model Imports
In order to avoid surprises after the upgrade/migration, it is critical to make sure that you started from a stable environment. The most important aspect of that is to make sure all imports are working before any upgrade, because the source may no longer be available, may not have been imported with the latest version of the bridge, or may simply not have been imported for a long time.
- Apply the latest MIMM and MIMB Cumulative patches (in the MIMM Server and all MIMB Harvesting Agents).
- Make sure that all Configurations are NOT on auto update.
- Stop any scheduled operations, including all automatic metadata harvesting and database maintenance (MANAGE > Schedules).
- Delete the model import cache in $MM_HOME/data/MIMB/cache.
- Manual full import (no incremental harvesting) of all Models (one by one) with clear import cache, unchecking "Create new versions only when new import has changes".
- Manual force build of all Configurations (one by one) with testing (connection stitching as needed for lineage).
Repository Database
If the MIMM repository database server has to be migrated to a new machine (e.g. from on prem to cloud), make sure you follow the proper database vendor process (see Database Server Backup/Restore).
Note that the above process works well as long as you stay with the same database technology (e.g. PostgreSQL), but such a migration cannot be performed between database technologies, such as Oracle or SQL Server to PostgreSQL, because the repository database implementations are different / optimized for each database technology. In such a case, one can use the MIMM Application's Backup/Restore format (directories of XML files), but there are known limitations as some content is not backed up in that format.
Harvesting Agents
If the MIMM Application Server has to be migrated to a new machine (e.g. from on prem to cloud), make sure to reconnect each MIMB harvesting agent to the new MIMM Application Server. Note that such MIMB harvesting agents can remain on prem, while connecting to the new MIMM Application Server on the cloud.
6.3 Application Server Execution and Initialization
The easiest way to start the MIMM Application Server is to execute the $MM_HOME/RestartApplicationServer.sh
utility. On Windows, you must "Run as administrator" the %MM_HOME%\RestartApplicationServer.bat
utility.
-
On Windows operating systems, you can alternatively use the Windows Services to control the MIMM Application Server by using the
%MM_HOME%\RestartApplicationService.bat
utility instead. This utility will create the Windows Service for the MIMM Application Server, if it was not already created by previous execution of this utility or theSetup.bat
utility. At this point, you can simply use the Windows Services to start, stop or restart the MIMM Application Server automatically.
When running the MIMM Application Server as a Windows Service, it is important to configure the user running such service in order to have full access rights to the needed files and applications. For example, the MIMB bridges involved in the metadata harvesting may need to invoke the SDK of third party software such as the COM based API of CA ERwin, or SAP BusinessObjects Universe Designer.
In order to set such access rights, go to the services manager of Windows, right-click on the MIMM Application Server service. Then, go to the "Log On" tab to define an account by name under which the service will run. -
On Linux operating systems, administrators can use the system daemon directories (e.g.
/etc/init.d/
or/etc/systemd/
) to control the MIMM Application Server (either using theRestartApplicationServer.sh
utility or directly controlling the tomcat server in the home directory).
The final initialization steps of the setup are performed over the web browser as follows:
-
Connection
Connecting to the server on Windows can be simply achieved by opening theMetadata Management
link in the home directory. In all cases, you can connect to the server using your internet browser to open by default: http://localhost:19980/MM. Note that the default port number of this URL may have been changed by the Setup
utility in the section Server Installation and Configuration.
Database
Define the connection to the previously created database (in the above steps), by providing the database type, user, password, and URL (JDBC connection). If you are using the PostgreSQL database bundled with the software package for Windows, then all these parameters should be already preset. PressTest Connection
button to verify proper database connectivity. Finally, when pressing the Save
button, the MIMM Application Server will create all the necessary tables in the database. -
License
Click on theDownload License Information
link to obtain the obtained yourHostInfo.xml
file that should be sent with your license request. Warning: Make sure your are NOT connected to any VPN during that step, then your license will work independently of your VPN connection. After you have received yourMM.lic
license file, browse for it and click on theSave License button
. Alternatively, the host info file can be generated at the Windows command line using$MM_HOME/bin/HostFileGen.sh
, orHostFileGen.bat
on Windows. This will generate yourHostInfo.xml
file (in the bin directory) that should be sent with your license request. After you have received yourMM.lic
license file, copy it to$MM_HOME/conf/MM.lic
and restart the application server. -
Login
Login as "Administrator" with password "Administrator". Note that you should change that password later in the application by going to:Tools -> Administration -> Users
)
6.4 Custom integration with authentication environments
MIMM is able to support three authentication methods:
- Native Authentication, where the password is managed by the software and stored within the database.
- LDAP Authentication, where the software does not manage or store the LDAP passwords at all. Instead, the password is simply passed through to LDAP in order to authenticate.
- External Authentication such as Single Sign On (SSO), where the software does not perform any authentication, and leaves that responsibility to a local single sign on service managed by the customer.
In Tools->Administration->Users one may specify either:
- Mixed Native and LDAP authentication where users may be authenticated either as native users or LDAP users
- External authentication where the system does not perform any authentication, leaving it up to a local Single Sign On environment.
6.4.1 Native Authentication Configuration Issues
There are no specific configuration steps for Native Authentication.
6.4.2 LDAP Authentication Configuration Issues
There are no special server configuration issues for LDAP Authentication. LDAP connectivity configuration is documented in the online help.
6.4.3 Windows Authentication Issues
It is also possible to enable the Application Server to obtain authentication for users from Windows authentication via the browser (client). This way, users will automatically be authenticated if they are running from a Windows session. To do so, one must install a third party product named Waffle (Windows Authentication Functional Framework) as an addon (see here):
- Please ensure that all LDAP settings are correct and users are able to log into the product via LDAP authentication. LDAP connectivity configuration is documented in the online help.
- Unzip the Waffle zip.
- Copy all the jar files from it to
$MM_HOME/tomcat/lib
- Open
$MM_HOME/tomcat/conf/web.xml
. Search for "Windows authentication support". Uncomment the block following that. - Restart MIMM.
- You should have Windows authentication enabled now. Any valid Windows user will be logged in as guest by default as long as licensing allows it. If you need to get an administrator interface, you can access: http://host:port/MM/Auth?nativeLogin (optionally you can force a redirect to either &redirectTo=/MM/Explorer or /MM/Manager)
6.5 Custom integration for Secure Socket Layer (SSL) communication
6.5.1 Configuring SSL for HTTPS access from Web Clients
The recommended method of configuring HTTPS access to Web Clients is using the -ssl
options of the Setup
utility
as explained in Server Installation and Configuration.
[{ -s | --ssl }]
[{ -sc | --ssl-cert-file }]
[{ -sk | --ssl-key-file }]
[{ -sp | --ssl-key-password }]
[{ -sr | --ssl-root-chain-file }]
For example:
$MM_HOME/Setup.sh -s true -sk MyPrivateKeyFile -sp MyPrivateKeyPassword
If the above method fails, you may manually update the server keystore location and password,
by editing $MM_HOME/tomcat/conf/server.xml
to change the value of certificateKeystoreFile
and certificateKeystorePassword
within the <Certificate>
section.
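For illustration only (this fragment is a sketch based on the Tomcat 9 connector syntax, not a copy of the shipped server.xml; the port, file name, and password are placeholders), the relevant section looks like:
<Connector port="443" SSLEnabled="true">
  <SSLHostConfig>
    <Certificate certificateKeystoreFile="conf/MyKeystoreFile"
                 certificateKeystorePassword="MyKeystorePassword" />
  </SSLHostConfig>
</Connector>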
6.5.2 Configuring SSL to access Remote Servers
There are multiple use cases for using SSL to access remote servers:
- Configuring HTTPS for Remote Metadata Harvesting Agents (Remote MIMM Server)
- Configuring SSL for Harvesting Metadata with API based bridges (e.g. a database server via JDBC)
- Configuring SSL for Repository Storage (database server via JDBC)
- Configuring LDAPS for Enterprise Directory
However, when using a self-signed certificate, that certificate needs to be explicitly imported into the Java environment of your server.
The recommended method to import such a certificate is using the -certificate
options of the Setup
utility
as explained in Server Installation and Configuration.
[{ -ch | --certificate-host }]
[{ -cp | --certificate-port }]
For example:
$MM_HOME/Setup.sh -ch MyServer.MyDomain.com -cp 443
If the above method fails, you may manually import the certificate into the java environment keystore as follows:
cd $MM_HOME/jre/lib/security
mv jssecacerts jssecacerts.old
$MM_HOME/bin/keytool -importkeystore -srckeystore YourSelfSignedCertificate -keystore jssecacerts
$MM_HOME/RestartServerApplication.sh (or RestartServerService.bat on Windows)
Note that the above import steps have to be repeated for the self-signed certificate of every remote server.
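To verify that the certificate was imported, you can list the keystore contents with the standard JDK keytool (it will prompt for the keystore password, if one was set):
$MM_HOME/jre/bin/keytool -list -keystore $MM_HOME/jre/lib/security/jssecacerts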
6.6 Security and Vulnerability Considerations
For additional security and vulnerability protection, the Apache Tomcat bundled within MIMM can be updated as follows:
6.6.1 Tomcat Upgrade To The Current Patches
Any new version of MIMM includes the latest version of Tomcat for bug fix, performance, security, and vulnerability reasons.
According to the Tomcat 9 changelog
https://tomcat.apache.org/tomcat-9.0-doc/changelog.html,
the last Common Vulnerabilities and Exposures (CVE) fix was made in 9.0.30.
Therefore, there is no real benefit in upgrading to the latest version.
In addition, it is not recommended that you manually install / patch the version of Tomcat bundled in MIMM,
as new Tomcat versions may not be compatible and may require more files to be copied.
The following describes the patch process anyway, although it is not officially supported (a command-line sketch follows the steps):
- Check the version of Apache Tomcat that was bundled within your MIMM (e.g. Tomcat 9.0.45), which is expressed in:
$MM_HOME/Documentation/License/MIMM-ThirdParty-LICENSES.html
and
$MM_HOME/tomcat/RELEASE-NOTES
- Download the latest patch version (e.g. 9.0.46) within the same major version:
https://tomcat.apache.org/download-90.cgi
- Unzip it in a temporary directory
- Stop Tomcat
- Copy the lib directory from that temporary directory to $MM_HOME/tomcat/lib (overwriting files if necessary)
- Restart Tomcat
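A command-line sketch of these steps on Linux (the download URL and the 9.0.46 version are examples; older patch releases move to archive.apache.org, and your stop/restart scripts may differ):
cd /tmp
curl -LO https://archive.apache.org/dist/tomcat/tomcat-9/v9.0.46/bin/apache-tomcat-9.0.46.zip
unzip apache-tomcat-9.0.46.zip
# stop Tomcat / the MIMM application server before overwriting libraries
cp apache-tomcat-9.0.46/lib/* $MM_HOME/tomcat/lib/
$MM_HOME/RestartServerApplication.sh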
6.6.2 Tomcat Check For Allowed Referrer
For additional protection, we recommend enabling Tomcat to check for allowed referrer:
- Edit $MM_HOME/tomcat/conf/web.xml and uncomment the two filter sections in the 'Checks referrer is allowed' section. The variable ${server.fqdn} will be substituted with the value of M_SERVER_FQDN in tomcat.properties.
- Edit $MM_HOME/tomcat/conf/tomcat.properties by changing the M_SERVER_FQDN variable from localhost to <myMMServer.myDomain> (see the one-line sketch below).
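As a one-line sketch of that edit (assuming the file contains the default M_SERVER_FQDN=localhost entry; substitute your server's fully qualified domain name):
sed -i 's/^M_SERVER_FQDN=localhost/M_SERVER_FQDN=myMMServer.myDomain/' $MM_HOME/tomcat/conf/tomcat.properties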
6.6.3 Tomcat Configuration to avoid TLS violation of Cryptographic Standard STD-IT-0005
MIMM uses Java 11, which supports Transport Layer Security (TLS) 1.3, but it is compiled for Java 8, which supports TLS 1.2. To ensure that Tomcat does not "drop down to using" a lower version (1.1 or 1.0), one must configure Tomcat appropriately.
For more details on configuring Tomcat SSL, see Tomcat's documentation at:
https://tomcat.apache.org/tomcat-9.0-doc/config/http.html#SSL_Support
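As an illustration (attribute syntax per the Tomcat 9 documentation above; whether your connector uses an SSLHostConfig element depends on your server.xml layout), restricting the negotiable protocols to TLS 1.2 and 1.3 looks like:
<SSLHostConfig protocols="TLSv1.2,TLSv1.3">
  ...
</SSLHostConfig>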
6.6.4 Tomcat Configuration to include HTTP security headers
For additional protection, you can edit $MM_HOME/tomcat/conf/web.xml to adjust the HTTP security headers.
By default, the application sets the following headers to the recommended values:
* Content-Security-Policy
* X-Content-Type-Options
* X-XSS-Protection
The X-Frame-Options header is not set by default; it can be set manually by adding the following fragment:
<init-param>
<param-name>X-Frame-Options</param-name>
<param-value>sameorigin</param-value>
</init-param>
The HSTS header is not necessary because, when the application is configured for HTTPS, HTTP is not allowed at all and no automatic redirection is provided.
However, if you want or need to add it, you can do so manually by adding the following fragment:
<init-param>
<param-name>Strict-Transport-Security</param-name>
<param-value>max-age=31536000; includeSubDomains</param-value>
</init-param>
6.6.5 Tomcat Configuration for Access Control with Valve
The first line of defense for the Tomcat based primary MIMM server, or any of its metadata harvesting agents / MIMB servers, is a proper firewall setup. As a last line of defense, the Tomcat remote address valve can be configured for access control.
In that case, the Tomcat of any metadata harvesting agent / MIMB server should be configured with a valve that only accepts requests from its associated primary MM server.
This requires editing the file tomcat/conf/MetaIntegration/localhost/MIMBWebServices.xml
in order to add the following line (extend the allow pattern to include the address of the primary MM server as needed):
<Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1"/>
For more details on configuring Tomcat access control with valve, see Tomcat's documentation at:
https://tomcat.apache.org/tomcat-9.0-doc/config/valve.html#Remote_Address_Valve
6.6.6 Secret / Password Encryption
There are a few cases where an account secret / user password is stored in the MIMM repository database using a two-way encryption method, so that the original password can be restored just before calling a third party API later:
- When configuring metadata harvesting (Model > Import > Setup), some bridge parameters require authentication to the source technology / server (e.g. user / password of a database or a BI server)
- When configuring LDAP based authentication (MANAGE > Users > LDAP)
- When configuring Email notification (MANAGE > Email Notification)
- When configuring Cloud Identity (MANAGE > Cloud Identity)
MIMM stores such user/password credentials in the repository database (i.e. at rest) using a confidential proprietary reversible encryption algorithm based upon industry standards.
NOTE 1: A second level of encryption can also be used during transport (i.e. in motion); see 6.5 Custom integration for Secure Socket Layer (SSL) communication:
- HTTPS for remote metadata harvesting between the main MIMM Server and a remote Harvesting Agent / Server (see 6.5.2 Configuring SSL to access Remote Servers).
- LDAPS for authentication to the Enterprise Directory when using LDAP based authentication (see 6.5.2 Configuring SSL to access Remote Servers).
NOTE 2: Alternative secret / password encryption and external storage solutions are available using Cloud Identity and Cloud Secret Vaults (such as Amazon Web Services, Microsoft Azure, or Google Cloud).
See MANAGE > Cloud Identity.
6.7 Lucene Search Engine Troubleshooting
The MM search capabilities are implemented by a Lucene search engine with indexes located in $MM_HOME/data.
If this Lucene search index directory has been lost, the MM server will automatically recreate it.
If the Lucene search index has been corrupted for any reason (e.g. power outage during indexing, out of memory, concurrent writes to the index, etc.),
then the search index directory can be deleted and the MM server will automatically recreate it.
Although not officially supported, the Administrator can attempt to use the Lucene CheckIndex tool to exorcise corrupted documents from the index.
Here are the steps you can follow (replace lucene_xxxxxxxx with the actual directory name of your search index).
cp -R $MM_HOME/data/search/lucene_xxxxxxxx /backup
cd /tmp
mkdir CheckIndex
cd CheckIndex
$MM_HOME/jre/bin/jar -xvf $MM_HOME/tomcat/webapps/MM.war
cd WEB-INF
java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex $MM_HOME/data/search/lucene_xxxxxxxx
- Examine the output of the above command to see if there is any corrupted segment. If there is, run the same command with the extra option "-exorcise" (see the example after these steps).
- You may delete the CheckIndex directory after you are done.
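For illustration, the exorcise run uses the same classpath and working directory as above (-exorcise permanently removes corrupted documents, so keep the backup copy made in the first step):
java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex -exorcise $MM_HOME/data/search/lucene_xxxxxxxx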
6.8 High Availability Considerations
MM can be deployed in High Availability (HA) clusters based on a configuration of Active/Passive MM servers. In this HA architecture, all user URL requests are connected to a centralized HA Management server (machine / node 1) which redirects all calls to the Active MM server (machine / node 2), or to the Passive MM server (machine / node 3) if the Active MM server (or the machine itself) is down. In such a switch-over case, the delay is limited to the time it takes to start up the Passive MM server, which is usually a couple of minutes assuming the machine was already active.
In the above HA architecture, both the Active and Passive MM servers must be sharing the same data on a shared database server, and on a shared file system server.
For more details, see Understanding the Data Locations.
In order to achieve overall High Availability (HA) of the total MM solution, these shared database and file system servers must also have their own HA deployments,
therefore involving many more machines / nodes for each of these database and file system servers.
Starting an MM server starts both the Tomcat web application server and the Lucene search engine;
therefore, file system sharing is very important for the user experience, as the response time is heavily driven by the Lucene search engine storing its indexes on that shared file system (in $MM_HOME/data).
Note that the HA deployment cannot be achieved with the bundled PostgreSQL database ($MM_HOME/postgresql)
available with the MM entry level edition on Windows.
In fact, HA deployment is only available with the most advanced MM edition, as the license stored in the shared database must be enabled to support both the Active and Passive servers (obviously on different host ids).
7. Metadata Harvesting Model Bridge (MIMB) Setup
The Metadata Integration or Metadata Harvesting from third party databases, data modeling, data integration or business intelligence tools is performed by the integrated Meta Integration® Model Bridge (MIMB) software. By default, the installer software deploys and configures both MIMM and MIMB on the same machine, where the MIMM Application Server accesses the MIMB Web Services locally. MIMB can also be installed and configured as a remote MIMB Agent on another machine. This is very useful in architecture deployments where the metadata management server is:
- deployed remotely on the cloud but needs to access metadata harvesting servers (agents) locally on premise, or
- deployed on Linux but needs to access metadata harvesting servers (agents) on a Windows machine where DM/DI/BI client tools are Windows only (e.g. COM based SDK).
Essential customizations (e.g. directories, memory) of the MIMB Application Server can be performed in the following configuration file:
$MM_HOME/conf/conf.properties
Recommended customizations include:
- M_BROWSE_PATH to browse local and mapped network drives.
All metadata harvesting file and directory parameter references are relative to the server. The reason is that the server must have access to these resources anytime another event (e.g., a scheduled harvest) is to occur. When harvesting a model, the UI presents a set of paths that may be browsed in order to select these files and directories. Setting the M_BROWSE_PATH parameter allows one to define which drives and network paths will be available in the UI. One may update M_BROWSE_PATH using the UI (on the application server) presented by the $MM_HOME/Setup.sh utility (see also Application Server Execution and Initialization), or by editing the $MM_HOME/conf/conf.properties file directly. On installation, the set includes all directly attached drives, which is specified by an asterisk "*" as follows: M_BROWSE_PATH=*.
Note for Windows based application servers: When running as a service, the (mapped) drive names and paths may not be the same as what a user sees when logged in, and thus the "*" value will not show all the drives you might expect when selecting drives in the UI. Instead, one must explicitly list all the drives and network paths that should be available to all users in the UI. Also, it is not sufficient to simply enter the mapped drive id (e.g., "N:\"), as that drive mapping is also generally not available to services. Thus, one should specify the physical drives by letter, and must specify the network paths completely, such as
M_BROWSE_PATH=C:\, E:\, \\network-drive\shared\
Note that the above also applies to the drives used by the backup and restore scripts.
- M_DATA_DIRECTORY to relocate data such as the log files and the metadata incremental harvesting cache, as needed for very large DI or BI tools.
- M_JAVA_OPTIONS to increase the maximum memory used by Java bridges during the metadata harvesting of very large DB, DI or BI tools. Note that this parameter defines the default maximum for all Java bridges; however, most memory intensive Java bridges (e.g. JDBC bridges) can define their own maximum memory in their last parameter, called Miscellaneous. (A combined sketch of these settings follows this list.)
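A combined sketch of these conf.properties settings (parameter names are from this guide; the values are illustrative assumptions, not shipped defaults):
M_BROWSE_PATH=C:\, E:\, \\network-drive\shared\
M_DATA_DIRECTORY=D:\MM\data
M_JAVA_OPTIONS=-Xmx8G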
When the MIMB Application Server is used as a local metadata harvesting agent connected to a MIMM Application Server on the cloud,
additional customizations are needed in the $MM_HOME/conf/agent.properties
configuration file, where:
- M_SERVER_URL is the URL of the MIMM Application Server on the cloud, such as M_SERVER_URL=http://server:19980/MM.
- M_AGENT_NAME is the agent name, such as M_AGENT_NAME=MyCompanyOnPremise, that the above MIMM Application Server will then use to refer to this metadata harvesting server agent. (A minimal example follows.)
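Putting the two together, a minimal $MM_HOME/conf/agent.properties for an on-premise agent might look like (values are the examples from this guide):
M_SERVER_URL=http://server:19980/MM
M_AGENT_NAME=MyCompanyOnPremise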
8. User Interface Look & Feel Customization
8.1 Login and Headers
Customize the following files and directories using the embedded instructions (in comments):
$MM_HOME/conf/resources/MM.properties
$MM_HOME/conf/resources/FavIcon.ico (can be .png)
$MM_HOME/conf/resources/LoginLogo.svg (can be .png)
$MM_HOME/conf/resources/HeaderLogo.svg (can be .png) with the following requirements:
- Height must be exactly 27px (required)
- Ideally single color (recommended)
- Color should match miti.mimm.header.fontcolor (recommended)
- Color must be compatible (look nice) with selected miti.mimm.header.bgcolor (recommended)
$MM_HOME/conf/resources/web (optional advanced only)
8.2 Metadata Explorer for Business Users
Customize the following files using the embedded instructions (in comments):
$MM_HOME/conf/resources/MetadataExplorer.xml
9. REST API SDK
The REST API SDK documentation is available within the UI by going to the Help menu (top right corner) under Help on REST API, or go directly to: http://localhost:19980/MM/REST-API/.
10. Database Server Backup/Restore
The MIMM Application Server requires a connection to an existing database server for metadata storage (the metadata repository). Meta Integration recommends backing up the MIMM metadata repository regularly, and especially before any upgrade.
This document describes the commands and instructions to perform the MIMM repository database backup and restore tasks. These database commands and instructions assume the following by default:
- Database Name: MM
- Database User: MM
- Database Password: MM123!
We assume that the objects from a backup are restored into the same database later.
Always use the same database software version to perform the backup and restore. Backups created by a more recent version of a database server may not be restorable in earlier versions of the database server.
Note that any data saved after a backup is taken will be lost if you restore that backup.
Stop the MIMM Application Server before you perform the backup and restore tasks. Restart the MIMM Application Server afterwards.
To ensure the optimal MIMM Application Server performance after a restore operation, run the database maintenance script in the MIMM Management UI using Tools → Administration → Schedules → Run Database Maintenance to update the database statistics.
Backup and restore on a very large MIMM repository database may take a long time. Refer to the database backup and restore documentation to enable parallelism, incremental backup and restore for better performance. The instructions given below are for a full database backup and restore.
10.1 Database on Oracle
We can use the Oracle Data Pump technology for Oracle 10g, 11g, 12c, 18c, and 19c databases.
First create an Oracle directory BACKUP_DIRECTORY that points to an operating system directory on the database server machine for reading and writing files.
Assume ORCL is the database server name or SID.
sqlplus / as sysdba
CREATE OR REPLACE DIRECTORY BACKUP_DIRECTORY as '/backups/Oracle';
GRANT read, write ON DIRECTORY BACKUP_DIRECTORY TO MM;
Then use Oracle Data Pump (expdp, impdp) to backup and restore the MIMM metadata repository.
Backup
To backup (export) the MIMM metadata repository database to a file MM.dmp and write the export log to expdpMM.log in the operating system directory /backups/Oracle:
expdp MM/MM123! schemas=MM directory=BACKUP_DIRECTORY dumpfile=MM.dmp logfile=expdpMM.log
Restore
Before you restore the backup to the MIMM repository database, you need to drop the schema MM to delete existing objects and data from the MIMM repository database. Restore will recreate the schema MM.
sqlplus SYS@ORCL as SYSDBA
DROP USER MM CASCADE;
To restore (import) the MIMM metadata repository database from a file MM.dmp and write the import log to impdpMM.log in the operating system directory /backups/Oracle:
impdp schemas=MM directory=BACKUP_DIRECTORY dumpfile=MM.dmp logfile=impdpMM.log
When prompted for Username, enter / as sysdba.
You may refer to the Oracle Data Pump documentation for more details on the expdp and impdp commands.
Backup and Restore using Recovery Manager (RMAN)
You may also use the Oracle Recovery Manager (RMAN) to backup and restore your MIMM repository database. It is good practice to create a separate tablespace for the MIMM repository database and restore only from that tablespace. For more information, refer to the Oracle Database Backup and Recovery User's Guide.
10.2 Database on SQLServer
To perform a backup or restore, log in to SQL Server Management Studio, open a new query window, and execute the backup or restore commands given below. You can also use the SQL Server Management Studio Object Explorer UI to perform the backup and restore tasks.
Backup
To backup the MIMM repository database MM into a file /backups/SQLServer/MM.bak, use the following command:
BACKUP DATABASE [MM] TO DISK = N'/backups/SQLServer/MM.bak' WITH NAME = N'MM-Full Database Backup'
GO
Restore
To restore the file /backups/SQLServer/MM.bak into the MIMM repository database MM, use the following commands.
USE [master]
GO
ALTER DATABASE MM SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO
RESTORE DATABASE [MM] FROM DISK = N'/backups/SQLServer/MM.bak' WITH REPLACE
GO
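If the restored database remains in single-user mode (this depends on the state captured in the backup), you can return it to multi-user mode afterwards:
ALTER DATABASE MM SET MULTI_USER
GO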
You may refer to the Microsoft SQL Server backup and restore documentation for more details on the above commands.
10.3 Database on PostgreSQL
Backup
To backup the MIMM repository database MM into a file /backups/PostgreSQL/MM.dmp, use the following command:
pg_dump -b -f /backups/PostgreSQL/MM.dmp -F t -d "MM" -h localhost -w -p 5432 -U MM
Restore
To restore the file /backups/PostgreSQL/MM.dmp into the MIMM repository database MM, use the following command. All database objects and data are dropped before being recreated.
pg_restore -c -F t -d "MM" -h localhost -w -p 5432 -U MM --if-exists /backups/PostgreSQL/MM.dmp
You may refer to the PostgreSQL pg_dump and pg_restore documentation for more details on the above commands.
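Note that both commands use -w (never prompt for a password), so the MM user's password must be available out of band, e.g. in a ~/.pgpass file or via the PGPASSWORD environment variable (an example sketch using the default password above):
PGPASSWORD=MM123! pg_dump -b -f /backups/PostgreSQL/MM.dmp -F t -d "MM" -h localhost -w -p 5432 -U MM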