Glossary
This glossary defines key terms used throughout the MetaKarta Semantic Language (MSL) documentation.
Core Concepts
Semantic Layer
A business-friendly abstraction layer that sits between raw data sources and analytical tools, providing consistent definitions, calculations, and business logic.
Semantic Model
A logical representation of business data that defines how facts and dimensions relate to each other, enabling consistent analysis across an organization.
Semantic Table
A logical collection of metrics and dimensions bound to specific columns in source database tables. Semantic tables serve as reusable components that can be referenced by multiple models.
Data Warehouse Concepts
Fact Table
The central table in a star or snowflake schema containing quantitative data (measures/metrics) about business events or transactions. Fact tables typically contain foreign keys to dimension tables and numeric values that can be aggregated.
Dimension Table
A table containing descriptive attributes that provide context for facts. Dimensions answer "who," "what," "where," and "when" questions about the data.
Star Schema
A database schema design where a central fact table is surrounded by dimension tables, forming a star-like pattern. Each dimension table is directly connected to the fact table via foreign key relationships.
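As a rough illustration of the star pattern, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database and joins them directly. All table and column names (`fact_sales`, `dim_product`, `dim_date`, and so on) are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical star schema: one fact table, two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (10, 2023), (11, 2024);
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 40.0), (2, 11, 60.0);
""")

# Each dimension joins straight to the fact table through a foreign key.
rows = conn.execute("""
SELECT p.product_name, d.year, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_date d ON d.date_id = f.date_id
GROUP BY p.product_name, d.year
ORDER BY p.product_name, d.year
""").fetchall()
```

Here `rows` holds one aggregated amount per product and year; no dimension table joins to another dimension table, which is what distinguishes a star from a snowflake.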
Snowflake Schema
An extension of the star schema where dimension tables are normalized into multiple related tables, creating a snowflake-like pattern with branching relationships.
Metrics and Measures
Metric
A quantifiable business measure calculated by aggregating one or more columns from fact tables. Metrics represent key performance indicators (KPIs) such as "Total Revenue" or "Average Order Value."
Fact (or Base Metric)
A row-level numeric attribute in a fact table representing a specific business event or transaction value, such as individual sales amounts, quantities, or costs. Facts serve as building blocks for metrics.
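The difference between the two is aggregation: a fact is a row-level value, while a metric applies an aggregation function (such as SUM or AVG) to one or more facts.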
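As a rough illustration of how row-level facts feed aggregated metrics, the sketch below uses plain Python rather than MSL syntax. The rows and field names (`order_id`, `amount`) are hypothetical.

```python
# Hypothetical row-level facts: one sale amount per order line.
sales_facts = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 1, "amount": 30.0},
    {"order_id": 2, "amount": 50.0},
]

# Metric "Total Revenue": SUM over the amount fact.
total_revenue = sum(row["amount"] for row in sales_facts)

# Metric "Average Order Value": total revenue divided by the number of distinct orders.
order_count = len({row["order_id"] for row in sales_facts})
average_order_value = total_revenue / order_count
```

The facts are the individual `amount` values; the metrics only exist once an aggregation is applied to them.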
Measure
A synonym for metric, commonly used in OLAP and BI contexts. In MSL, "metric" is the preferred term.
Additive Metric
A metric whose values can be summed across all dimensions. Example: Total Sales can be summed across time, geography, and product.
Non-Additive Metric
A metric whose values cannot be meaningfully summed across dimensions. Example: Distinct Count of Customers cannot be summed across time periods.
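The following minimal sketch shows why a distinct count is non-additive. The purchase rows and customer names are made up for the example.

```python
# Hypothetical purchases as (month, customer) pairs.
purchases = [
    ("2024-01", "alice"),
    ("2024-01", "bob"),
    ("2024-02", "alice"),
    ("2024-02", "carol"),
]

def distinct_customers(rows):
    """Count distinct customers in a list of (month, customer) pairs."""
    return len({customer for _, customer in rows})

# Distinct customers computed separately for each month.
per_month = {
    month: distinct_customers([r for r in purchases if r[0] == month])
    for month in sorted({m for m, _ in purchases})
}

sum_of_monthly_counts = sum(per_month.values())  # counts "alice" twice
true_distinct = distinct_customers(purchases)    # counts each customer once
```

Summing the monthly counts gives 4, while the distinct count over both months is 3, because one customer bought in both months.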
Semi-Additive Metric
A metric that can be summed across some dimensions but not others. Example: Account Balance can be summed across customers but not across time.
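A small sketch of the account balance example follows. The snapshot rows are hypothetical, and taking the latest snapshot is only one common way to aggregate a balance over time (an average would be another).

```python
# Hypothetical end-of-month balance snapshots: (snapshot_date, customer, balance).
balances = [
    ("2024-01-31", "alice", 100.0),
    ("2024-01-31", "bob", 250.0),
    ("2024-02-29", "alice", 150.0),
    ("2024-02-29", "bob", 200.0),
]

# Summing across customers for a single date is meaningful.
total_end_of_january = sum(b for d, _, b in balances if d == "2024-01-31")

# Summing across time is not: it adds the same money more than once.
naive_sum_over_time = sum(b for _, _, b in balances)

# One alternative across time: use the latest snapshot only.
latest_date = max(d for d, _, _ in balances)
closing_balance = sum(b for d, _, b in balances if d == latest_date)
```

The naive sum over both months is 700.0, while the closing balance across all customers is 350.0.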
Dimension Types
Normal Dimension
A dimension sourced from a single database table, used in classic star schema designs. A normal dimension can be either standard or time-based.
Time Dimension
A special dimension representing time periods (years, quarters, months, days, etc.), often used for time-series analysis.
Snowflake Dimension
A dimension composed of multiple normalized tables with hierarchical relationships. Example: Customer → City → State → Country.
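To make the chain concrete, the sketch below models the Customer → City → State → Country example as four small lookup tables and resolves one customer through all of them. The keys and field names are hypothetical.

```python
# Hypothetical normalized tables, each pointing to its parent level.
customers = {"C001": {"name": "Ada", "city_id": "AUS"}}
cities = {"AUS": {"city": "Austin", "state_id": "TX"}}
states = {"TX": {"state": "Texas", "country_id": "US"}}
countries = {"US": {"country": "United States"}}

def flatten_customer(customer_id):
    """Follow the foreign keys level by level and return one flat record."""
    customer = customers[customer_id]
    city = cities[customer["city_id"]]
    state = states[city["state_id"]]
    country = countries[state["country_id"]]
    return {
        "customer": customer["name"],
        "city": city["city"],
        "state": state["state"],
        "country": country["country"],
    }
```

Calling `flatten_customer("C001")` walks three joins to produce the record a single denormalized dimension table would hold in one row.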
Outrigger Dimension
A secondary dimension referenced from another dimension table. Example: A Product dimension that references a Brand dimension.
Role-Playing Dimension
A single dimension table used in multiple contexts within the same model. Example: A Date dimension used as both "Order Date" and "Ship Date."
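In SQL terms, role-playing usually means joining the same dimension table more than once under different aliases. The sketch below shows this with SQLite; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE fact_orders (order_id INTEGER, order_date_id INTEGER, ship_date_id INTEGER);
INSERT INTO dim_date VALUES (1, '2024-03-01'), (2, '2024-03-04');
INSERT INTO fact_orders VALUES (100, 1, 2);
""")

# The same dim_date table plays two roles: "od" (Order Date) and "sd" (Ship Date).
row = conn.execute("""
SELECT f.order_id,
       od.calendar_date AS order_date,
       sd.calendar_date AS ship_date
FROM fact_orders f
JOIN dim_date od ON od.date_id = f.order_date_id
JOIN dim_date sd ON sd.date_id = f.ship_date_id
""").fetchone()
```

Only one physical date table exists; the aliases give it a separate identity for each foreign key in the fact table.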
Degenerate Dimension
A dimension attribute stored directly in the fact table rather than in a separate dimension table. Example: Order Number or Invoice Number.
Junk Dimension
A dimension combining multiple low-cardinality flags and indicators to simplify the schema and reduce the number of dimension tables.
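One way to build such a dimension is to enumerate every combination of the flags and assign each combination a surrogate key. The sketch below assumes three hypothetical flags; real implementations may instead store only the combinations that actually occur in the data.

```python
from itertools import product

# Hypothetical low-cardinality flags and their possible values.
flags = {
    "is_gift": [False, True],
    "payment_type": ["card", "cash"],
    "is_expedited": [False, True],
}

# One row per combination, each with a surrogate key ("junk_key" is a made-up name).
junk_dimension = [
    dict(zip(flags, combo), junk_key=key)
    for key, combo in enumerate(product(*flags.values()), start=1)
]
```

The fact table then carries a single `junk_key` column instead of three separate flag columns or three tiny dimension tables.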
Slowly Changing Dimension (SCD)
A dimension whose attribute values change over time. Common techniques for handling those changes are:
- Type 1: Overwrite old values
- Type 2: Create new rows with version history
- Type 3: Add columns for previous values
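The three techniques listed above can be sketched on plain dictionaries. The column names (`city`, `previous_city`, `valid_from`, `valid_to`, `is_current`) are illustrative assumptions, not names prescribed by MSL.

```python
from copy import deepcopy

def apply_type1(row, new_city):
    """Type 1: overwrite the old value; no history is kept."""
    updated = dict(row)
    updated["city"] = new_city
    return updated

def apply_type2(history, new_city, change_date):
    """Type 2: close the current row and append a new versioned row."""
    history = deepcopy(history)
    current = history[-1]
    current["valid_to"] = change_date
    current["is_current"] = False
    history.append({
        "customer_id": current["customer_id"],
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    return history

def apply_type3(row, new_city):
    """Type 3: keep the previous value in an extra column."""
    updated = dict(row)
    updated["previous_city"] = row["city"]
    updated["city"] = new_city
    return updated
```

Type 2 is the only variant here that grows the table; Types 1 and 3 keep one row per customer.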
Many-to-Many Dimension
A dimension where a single fact can relate to multiple dimension members. Implemented using a bridge or junction table. Example: A sale associated with multiple sales representatives.
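The sketch below illustrates the sales representative example with a hypothetical bridge table, and shows a side effect worth keeping in mind: totals grouped by a many-to-many dimension can add up to more than the overall total.

```python
# Hypothetical sales facts: sale_id -> amount.
sales = {1: 100.0, 2: 50.0}

# Hypothetical bridge rows linking each sale to one or more representatives.
bridge = [(1, "rep_a"), (1, "rep_b"), (2, "rep_a")]

# Sales amount attributed to each representative through the bridge.
sales_by_rep = {}
for sale_id, rep in bridge:
    sales_by_rep[rep] = sales_by_rep.get(rep, 0.0) + sales[sale_id]

true_total = sum(sales.values())
sum_of_rep_totals = sum(sales_by_rep.values())  # sale 1 is counted for both reps
```

Here the per-representative totals sum to 250.0 although only 150.0 was sold, because sale 1 is attributed in full to both representatives.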
Hierarchies and Levels
Hierarchy
An ordered arrangement of dimension attributes that defines drill-down paths for analysis. Example: Year → Quarter → Month → Day.
Level
A single position within a hierarchy representing a specific granularity of data. Example: "Month" is a level in a time hierarchy.
Drill-Down
Navigating from a higher level of a hierarchy to a more detailed level. Example: From Year to Quarter.
Roll-Up
Navigating from a lower level of a hierarchy to a more aggregated level. Example: From Day to Month.
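Rolling up can be sketched as re-aggregating the same data at a coarser level of the hierarchy. The example below assumes ISO-formatted date strings and a hypothetical `roll_up` helper.

```python
from collections import defaultdict

# Hypothetical daily sales keyed by ISO date (YYYY-MM-DD).
daily_sales = {"2024-01-15": 10.0, "2024-01-20": 5.0, "2024-02-01": 7.0}

def roll_up(daily, level):
    """Aggregate daily values to the given level of a Year → Month → Day hierarchy."""
    prefix_length = {"year": 4, "month": 7, "day": 10}[level]
    totals = defaultdict(float)
    for day, amount in daily.items():
        totals[day[:prefix_length]] += amount
    return dict(totals)

monthly = roll_up(daily_sales, "month")
yearly = roll_up(daily_sales, "year")
```

Moving from `yearly` to `monthly` to `daily_sales` is the corresponding drill-down path.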
Relationships and Keys
Join
A relationship definition between two semantic tables, typically representing a many-to-one relationship from a fact table to a dimension table.
Primary Key
A column or set of columns that uniquely identifies each row in a table.
Foreign Key
A column or set of columns in one table that references the primary key of another table, establishing a relationship between the tables.
Unique Key
In MSL, a definition of columns that uniquely identify rows in a semantic table. A table can have multiple unique keys, with one serving as the default for joins.
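Whether a set of columns qualifies as a unique key can be checked against sample data. The helper below is a hypothetical validation sketch in plain Python and is not part of MSL.

```python
def is_unique_key(rows, columns):
    """Return True if no two rows share the same values for the given columns."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in columns)
        if key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical order lines: order_id alone repeats, (order_id, line_no) does not.
order_lines = [
    {"order_id": 1, "line_no": 1},
    {"order_id": 1, "line_no": 2},
    {"order_id": 2, "line_no": 1},
]
```

With this data, `is_unique_key(order_lines, ["order_id"])` is false, while `is_unique_key(order_lines, ["order_id", "line_no"])` is true.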
Bridge Table
A table used to resolve many-to-many relationships between facts and dimensions, containing foreign keys to both tables.
OLAP Concepts
OLAP (Online Analytical Processing)
A category of software technology that enables analysts to interactively analyze multidimensional data from multiple perspectives.
MOLAP (Multidimensional OLAP)
An OLAP approach where data is stored in pre-calculated multidimensional cubes, providing fast query performance at the cost of storage and refresh time.
ROLAP (Relational OLAP)
An OLAP approach where data remains in relational tables and queries are executed dynamically using SQL, providing better scalability for large datasets.
Technical Terms
Catalog (Database)
A container for database schemas, used by some database systems (e.g., Snowflake, Databricks) to organize database objects.
Schema (Database)
A logical container for database objects such as tables, views, and procedures within a database or catalog.
Connection
In MSL, a configuration that defines how to connect to a specific database catalog and schema.
Dialect
A specific variant of SQL or query language associated with a particular relational database or business intelligence system (e.g., Snowflake, Oracle, Databricks).
Expression
A calculation or formula defined in MSL and written in the syntax of a target relational database or business intelligence system. Expressions can be dialect-specific to accommodate differences between systems.
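The idea of a dialect-specific expression can be sketched as a lookup from dialect name to SQL text. The structure and the `expression_for` helper below are hypothetical and do not reflect actual MSL syntax; the two SQL snippets truncate a date to the first day of its month in Snowflake and Oracle respectively.

```python
# Hypothetical mapping: one logical expression, one SQL variant per dialect.
month_of_order = {
    "snowflake": "DATE_TRUNC('MONTH', order_date)",
    "oracle": "TRUNC(order_date, 'MM')",
}

def expression_for(expressions, dialect, default=None):
    """Pick the SQL text for a dialect, falling back to a default if one is given."""
    if dialect in expressions:
        return expressions[dialect]
    if default is not None:
        return default
    raise KeyError(f"No expression defined for dialect {dialect!r}")
```

A caller would resolve the expression once per target system, for example `expression_for(month_of_order, "oracle")`.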
Aggregation
A calculation that combines multiple rows of data into a single summary value using functions like SUM, AVG, COUNT, MIN, or MAX.
MSL-Specific Terms
Semantic Model Catalog
The top-level MSL object that organizes and contains multiple semantic models, connections, and shared configurations.
Fact Semantic Table
The semantic table designated as the central fact table in a model, containing the metrics to be analyzed.
Role-Playing Template
In MSL, a configuration that defines how a dimension's names are modified when used in different contexts (e.g., adding "Order" or "Ship" prefixes to the Date dimension).
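The renaming effect can be sketched with an ordinary string template. The `{role} {name}` pattern and the `apply_role` helper are assumptions made for illustration; the actual template syntax in MSL may differ.

```python
def apply_role(template, attribute_names, role):
    """Rename each dimension attribute for one role using a format-style template."""
    return [template.format(role=role, name=name) for name in attribute_names]

# Hypothetical attributes of a shared Date dimension.
date_attributes = ["Date", "Month", "Year"]

order_names = apply_role("{role} {name}", date_attributes, "Order")
ship_names = apply_role("{role} {name}", date_attributes, "Ship")
```

This yields "Order Date", "Order Month", "Order Year" for one role and the matching "Ship" names for the other, so both roles can coexist in the same model without name clashes.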
Category
Semantic metadata assigned to dimensions to influence visualization behavior in BI tools (e.g., geographic roles, data types).
BI and Analysis Terms
KPI (Key Performance Indicator)
A measurable value that demonstrates how effectively an organization is achieving key business objectives. In MSL, KPIs are implemented as metrics.
Slice and Dice
The process of viewing data from different perspectives by fixing a single value of one dimension (slicing) or selecting specific values across several dimensions to form a smaller sub-cube (dicing).
Filter
A condition applied to limit the data included in an analysis based on dimension or metric values.
Granularity
The level of detail in data. Fine granularity refers to detailed data (e.g., individual transactions), while coarse granularity refers to summarized data (e.g., monthly totals).