|
||||||||
|
||||||||
| Vendor | Microsoft |
| Tool Name | Office Excel |
| Tool Version | 97 to 2007 |
| Tool Web Site | http://office.microsoft.com/excel/ |
| Supported Methodology | [Metadata Management] ETL (Source and Target Data Stores, Transformation Lineage, Expression Parsing, Expression Conversion) via OLE API on Excel 97 to 2003 (XLS), Excel 2007 (XLSX), or Open XML Excel (XLSM) File |
Import tool: Microsoft Office Excel 97 to 2007 (http://office.microsoft.com/excel/)
Import interface: [Metadata Management] ETL (Source and Target Data Stores, Transformation Lineage, Expression Parsing, Expression Conversion) via OLE API on Excel 97 to 2003 (XLS), Excel 2007 (XLSX), or Open XML Excel (XLSM) File from Import bridge: 'MicrosoftExcel' 6.1.0 - Mar 2 2010 19:16:52
This bridge allows to import metadata expressed in Microsoft Office Excel spreadsheets, which may contain anything from simple data models to complex mappings (where various Excel sheets are used to define the source data models, target data model, and the mapping between them) along with reusable transformation rules. The imported spreadsheet must be provided in XLSX format. The 'File' parameter should contain the name of this spreadsheet, referred to as the 'metadata source spreadsheet.'
The metadata source spreadsheet must be provided in a specific format referred to as the 'Standard' format. As an example of this format and how to populate it, there is a Standard-Blank.xlsx file with a complete set of the possible worksheets and column headers with definitions describing the expected contents and their meaning. By default, this file is in the directory at $MetaIntegration\conf\MIRModelBridgeTemplate\MIRMicrosoftExcel\.
Also provided, is a an Excel 'Add-In' based 'ribbon' or menu to help navigate and manage spreadsheets which are designed using the standard format. These functions include:
- Inserting new worksheets in the 'Standard' format
- Show and hide functions to control which columns are displayed
- Annotation, formatting and validation of entries on a worksheet
In order to activate these options in Microsoft Office Excel, one must install the add-in provided with the software. Such add-ins are provided for Microsoft Office Excel 2007 and 2003. Instructions for installation are provided in the exported spreadsheet in the comment for the 'how to use' cell on all worksheets of type 'Models'.
ASSUMPTIONS AND LIMITATIONS
Productivity tools, such as spreadsheet editing tools, are commonly used to capture, document or publish physical, logical and business metadata throughout the industry. A tool like Microsoft Office Excel is an especially common choice as it is nearly always available. It is understood by and familiar to business users, offers simple copy and paste compatibility with other Office applications, and conveniences such as spelling correction. Microsoft Excel import and export capabilities are provided to leverage these advantages.
1. All metadata source spreadsheets must be defined in the Standard format, as this is the only format directly supported. To accommodate a legacy migration activity involving a very large number of existing spreadsheets, the format expected by the bridge is not hard coded but rather defined by external 'MetaMap' and 'Blank' spreadsheets specifying how each spreadsheet cell relates to a neutral repository metamodel. This flexibility is available only as a specific customized bridge which may be provided through consulting services.
2. While both import (reading from spreadsheet) and export (writing to spreadsheet) capabilities are offered, they are not designed to support round-trip re-engineering. For example, one may forward engineer the mapping requirements into a DI/ETL tool for design, but the purpose of exporting from that DI/ETL tool back to Excel is to provide a mapping lineage summary independent of all internal multiple complex transformations and not to re-engineer the design and re-import.
3. Data Design and Architecture tool models may support round-trip type editing within Excel, to take advantage of the spell-checking, column and row based copy and paste, and other conveniences inherent to a spreadsheet environment. In this case, one should export to a spreadsheet, make the edits and then re-import into the original design tool, leveraging the compare and merge facilities provided with the receiving tool.
4. New source or target data stores may be exported to an existing spreadsheet. However, no merge or update features are offered at this time. Also, when importing an Excel spreadsheet into a DI/ETL tool, the receiving tool is responsible for integrating/merging the models and mappings within its existing versions.
5. Specific assumptions and limitations are provided for each pre-defined format in the appropriate $MetaIntegration\conf\MIRModelBridgeTemplate\MIRMicrosoftExcel\*-Blank.xlsx file
FREQUENTLY ASKED QUESTIONS
Q: How does one handle the error: 'Cannot open file. File may be locked by another process or is protected or Vista prevents Excel API from opening a spreadsheet when called by a service.'?
A: This error only occurs with .XLS files, as the bridge must use the Excel Office API to parse them. One should ALWAYS save such a file as a .XLSX file and import that file.
Q: How does one handle errors related to Excel memory use
A: This error only occurs with .XLS files, as the bridge must use the Excel Office API to parse them. One should ALWAYS save such a file as a .XLSX file and import that file.
Q: How does the bridge handle 'implicit joins'?
A: When a transformation is defined that requires the joining of two or more tables, a join should be specified on a 'Joins' type worksheet. If there are no entries on any 'Joins' type worksheet or if there are no worksheets of that type, then the bridge will attempt to generate the corresponding join specifications required. In many cases, the bridge will not have sufficient information to create the correct join and will instead generate a proposed join that must be corrected in the export tool.
On the other hand, if there are any entries (join specifications) on any 'Joins' type worksheets, the bridge assumes that all joins are fully specified. If an implicit join is discovered (a transformation requires a join but one has not been specified for it) that implicit join is considered an error and the bridge will not import the model.
Q: Why do I see errors such as Table not found, Schema not found, or Model not found.
A: This is an example of a limitation for the parsing of the XLS format, oftentimes seen with larger models. One should ALWAYS save such a file as a XLSX file and import that file.
| Parameter Name | Description | Type | Values | Default | Scope | |||||
| File | This bridge reads an Excel spreadsheet in XLSX, XLSM, XLS or XML format and creates a model from its textual information. The "File" parameter should contain the name of this first spreadsheet, generally referred to as the "metadata import spreadsheet." This metadata import spreadsheet MUST be formatted to match the structure specified by the "Metadata mapping spreadsheet" spreadsheet parameter. Please find samples of these metadata import spreadsheets in $MetaIntegration\conf\MIRModelBridgeTemplate\MIRMicrosoftExcel\BlankFiles\. Using these files as guides, you may simply create a metadata import spreadsheet with your own model information that matches these examples. |
FILE |
|
Mandatory | ||||||
| __Incremental harvesting date | If a date is set, only metadata created or updated since the date will be imported. Use the ISO 8601 date format: yyyy-MM-dd HH:mm:ss | DATE | Hidden |
| Meta Integration Repository (MIR) Metamodel (based on the OMG CWM standard) |
"Microsoft Office Excel" Metamodel |
Mapping Comments |
| Attribute | Attributes, SourceAttributes or TargetAttributes | |
| Comment | Attribute Comment | |
| Description | Attribute Description | |
| InitialValue | Attribute Initial Value | |
| Name | Attribute Name | |
| Optional | Is Attribute Optional | Is Attribute Required |
| PhysicalName | Attribute Physical name | |
| Position | Attribute Position | |
| Class | Classes, SourceClasses or TargetClasses | |
| Comment | Class Comment | |
| Description | Class Description | |
| Name | Class Name | |
| PhysicalName | Class Physical name | |
| ClassifierMap | Computed based on feature maps | |
| DataAttribute | Computed based on feature maps | |
| DataSet | Computed based on Transformation | |
| DatabaseCatalog | DataPackages, SourceDataPackages or TargetDataPackages | Created based on the "System Type" |
| Comment | Package Comment | |
| Description | Package Description | |
| Name | Package Name | |
| PhysicalName | Package Physical name | |
| SystemType | System Type | |
| DatabaseSchema | Schemas, SourceSchemas or TargetSchemas | Created based on the "System Type" |
| Comment | Schema Comment | |
| Description | Schema Description | |
| Name | Schema Name | |
| PhysicalName | Schema Physical name | |
| DerivedType | Create using "Data Type Inherited From" or calculated using template data. | |
| Comment | Data Type Comment | |
| Description | Data Type Description | |
| InitialValue | Data Type Initial Value | |
| Length | Data Type Length | |
| LowerBound | Data Type Min Value | |
| Name | Data Type Name | |
| PhysicalName | Data Type Physical Name | |
| Scale | Data Type Scale | |
| UpperBound | Data Type Max Value | |
| UserDefined | Calculated based on the datatype usage | |
| DesignPackage | DataPackages, SourceDataPackages or TargetDataPackages | |
| Comment | Package Comment | |
| Description | Package Description | |
| Name | Package Name | |
| PhysicalName | Package Physical name | |
| FeatureMap | FeatureMaps | Computed based on source and target attribute references. |
| Description | Feature Map Description | |
| Name | Feature Map Name | |
| Operation | Feature Map Operation | |
| OperationDescription | Feature Map Operation Description | |
| Model | Models | |
| Author | Model Creation Time | |
| Comment | Model Comment | |
| CreationTime | Model Creation Time | |
| Description | Model Description | |
| ModificationTime | Model Creation Time | |
| Modifier | Model Creation Time | |
| Name | Model Name | |
| PhysicalName | Model Physical name | |
| PropertyType | Defined in the template | |
| DataType | Defined in the template | |
| InitialValue | Defined in the template | |
| Name | Defined in the template | |
| Name | Defined in the template | |
| Value | Defined in the template | |
| Transformation | Transformations | |
| Description | Transformation Description | |
| Name | Transformation Name | |
| Operation | Transformation Operation | |
| OperationDescription | Transformation Operation Description | |
| TransformationTask | TransformationTasks | |
| Description | Transformation Task Description | |
| Name | Transformation Task Name | |
| TypeValue | DataTypes, SourceDataTypes or TargetDataTypes | |
| Description | Data Type Enumerated Value Description | |
| Name | Data Type Enumerated Value |
Last updated on Tue, 2 Mar 2010 19:24:11