Overview

Introduction

Modern organizations store massive amounts of data in their warehouses and lakes. Yet the real challenge is not storage, but transforming raw data into actionable insights. Business leaders want answers like: Which region drives the most revenue? How many repeat customers do we have? What are the trends this quarter compared to last?

Answering these questions directly from raw SQL tables is slow and inconsistent. Different analysts may interpret data differently, leading to multiple versions of “truth.”

Semantic Hub Language addresses this challenge. It provides a consistent, reusable, and business-friendly layer of meaning on top of raw data. This semantic layer defines measures, dimensions, hierarchies, and rules, enabling BI and AI tools to query the warehouse uniformly.


Why OLAP Still Matters

The idea of Online Analytical Processing (OLAP) has been around for decades. At its core, OLAP systems enable easier slicing, dicing, and analysis of data from multiple perspectives.

Two traditional approaches evolved:

  • MOLAP (Multidimensional OLAP): Think of MOLAP as building a giant “data cube” in advance. Every possible combination of product, region, and time period is pre-calculated. Reports are lightning-fast, but the cube is expensive to build and slow to refresh when new data arrives. (Figure: a MOLAP model.)
  • ROLAP (Relational OLAP): ROLAP keeps data in relational tables organized as star or snowflake schemas. Queries are run live with SQL, calculating results on demand. This scales better to big data but can be slower unless carefully tuned. (Figure: a star schema.)
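
The trade-off between the two approaches can be sketched in a few lines of Python. This is only an illustration of the general idea, not how any particular OLAP engine works; the `SALES` rows, the `ALL` marker, and both function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Hypothetical fact rows: (product, region, quarter, revenue)
SALES = [
    ("widget", "east", "Q1", 100.0),
    ("widget", "west", "Q1", 150.0),
    ("gadget", "east", "Q1", 200.0),
    ("widget", "east", "Q2", 120.0),
]

ALL = "*"  # marker meaning "aggregated over every value of this dimension"


def build_cube(rows):
    """MOLAP-style: pre-aggregate every combination of dimension values."""
    cube = defaultdict(float)
    for prod, region, quarter, revenue in rows:
        # Each row feeds 2**3 = 8 cells: every dimension is either kept or rolled up.
        for p, r, q in product((prod, ALL), (region, ALL), (quarter, ALL)):
            cube[(p, r, q)] += revenue
    return dict(cube)


def query_on_demand(rows, prod=ALL, region=ALL, quarter=ALL):
    """ROLAP-style: scan the rows and aggregate at query time."""
    return sum(
        revenue
        for p, r, q, revenue in rows
        if prod in (ALL, p) and region in (ALL, r) and quarter in (ALL, q)
    )


cube = build_cube(SALES)                                  # paid once, up front
east_total_cube = cube[(ALL, "east", ALL)]                # then a dictionary lookup
east_total_live = query_on_demand(SALES, region="east")  # recomputed on every call
```

With these sample rows both paths return 420.0 for the east region. The cube answers with a single lookup but must be rebuilt when rows change, while the on-demand query always reflects the current rows at the cost of a scan.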

Both approaches shaped BI as we know it. However, as data volumes increased, organizations required a hybrid solution that combines the scalability of ROLAP with the usability of MOLAP.


The MetaKarta Approach

MetaKarta introduces a semantic model that blends the best of both worlds:

  • To the BI user, the model resembles MOLAP: a clean list of measures and dimensions, making it easy to explore.
  • Under the hood, MetaKarta builds on ROLAP, where data remains in place and is queried directly in the warehouse.

Depending on the database, MetaKarta adapts:

  • Database Semantic Layer
      • Implements standard metrics, performance caching optimizations, and column and row security.
      • Uses native semantic objects where possible (Snowflake Semantic Views, Databricks Metric Views, Oracle Analytic Views).
      • Falls back to views if the database does not support semantic objects.
  • BI Semantic Layer
      • Extends the database layer with MOLAP-like hierarchies and drill paths defined in the model.
      • Provides compatibility views for BI tools that cannot read database semantic objects directly.
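
The "native object where possible, otherwise a view" rule can be pictured as a simple lookup with a default. The sketch below is purely illustrative: the dictionary and function names are hypothetical and say nothing about how MetaKarta actually selects a target; only the three platform and object names come from the list above.

```python
# Native semantic object types named in the list above, keyed by platform.
NATIVE_SEMANTIC_OBJECTS = {
    "snowflake": "Semantic View",
    "databricks": "Metric View",
    "oracle": "Analytic View",
}


def semantic_object_for(database: str) -> str:
    """Return the native semantic object type, or fall back to a plain view."""
    return NATIVE_SEMANTIC_OBJECTS.get(database.lower(), "View")
```

For example, `semantic_object_for("Snowflake")` yields `"Semantic View"`, while any platform missing from the table falls back to `"View"`.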

The result: analysts and executives see a consistent, business-friendly view of data, regardless of the underlying platform.


Measures: Numbers That Answer Questions

Measures are the heartbeat of analysis. They represent quantifiable business facts—numbers we care about.

Examples:

  • Total Sales = SUM(amount × quantity)
  • Number of Customers = COUNT(DISTINCT customer_id)
  • Average Session Duration = SUM(session_duration) ÷ COUNT(visits)
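
As a rough sketch, the three example measures above correspond to ordinary SQL aggregates. The snippet below runs them against an in-memory SQLite database using Python's standard `sqlite3` module. The table layouts and sample rows are assumptions made for illustration (in particular, `amount` is treated as a unit price), and the average assumes one row per visit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer_id INTEGER, amount REAL, quantity INTEGER);
INSERT INTO sales VALUES (1, 10.0, 2), (1, 5.0, 1), (2, 20.0, 3);

CREATE TABLE sessions (visit_id INTEGER, session_duration REAL);
INSERT INTO sessions VALUES (1, 30.0), (2, 90.0), (3, 60.0);
""")

# Total Sales and Number of Customers both aggregate the sales fact table.
total_sales, customers = conn.execute(
    "SELECT SUM(amount * quantity), COUNT(DISTINCT customer_id) FROM sales"
).fetchone()

# Average Session Duration: total duration divided by the number of visits.
avg_session = conn.execute(
    "SELECT SUM(session_duration) / COUNT(*) FROM sessions"
).fetchone()[0]
```

With these sample rows, `total_sales` is 85.0, `customers` is 2, and `avg_session` is 60.0. Note that customer 1 appears twice in `sales` but is counted once, which is the point of `COUNT(DISTINCT ...)`.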

When you define measures in MetaKarta:

  • Start from the fact table (the table with events such as sales or clicks).
  • Decide which fields should be aggregated and how.
  • Use business terminology so measures are recognizable to non-technical users.

Think of measures as the verbs in your data stories: How much? How many? How long?


Dimensions: Context for the Numbers

Numbers alone don’t tell the story. Dimensions provide the who, what, where, and when that bring measures to life.

Examples:

  • Customer Gender → Compare sales by men vs. women.
  • Order Date → Month → Track revenue growth by month.
  • Customer Location → State → Spot your strongest regions.

In MetaKarta, dimensions are modeled virtually. They may come from:

  • Fact table columns (e.g., order_date).
  • Separate dimension tables (e.g., customers, products).
  • Even multiple joined datasets.

Dimensions are the nouns in your data stories. They let users slice, filter, and drill into measures from every angle.
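
To make the slicing concrete, here is a minimal sketch using Python's standard `sqlite3` module. It groups one measure (revenue) by a dimension derived from a fact column (order month) and by a dimension that lives in a separate table (customer state). All table names, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, state TEXT);
INSERT INTO customers VALUES (1, 'CA'), (2, 'NY');

CREATE TABLE orders (order_date TEXT, customer_id INTEGER, revenue REAL);
INSERT INTO orders VALUES
  ('2024-01-15', 1, 100.0),
  ('2024-01-20', 2, 50.0),
  ('2024-02-03', 1, 70.0);
""")

# Dimension taken from a fact table column: Order Date rolled up to Month.
by_month = conn.execute("""
    SELECT strftime('%Y-%m', order_date) AS month, SUM(revenue)
    FROM orders GROUP BY month ORDER BY month
""").fetchall()

# Dimension taken from a separate dimension table: Customer Location by State.
by_state = conn.execute("""
    SELECT c.state, SUM(o.revenue)
    FROM orders AS o JOIN customers AS c USING (customer_id)
    GROUP BY c.state ORDER BY c.state
""").fetchall()
```

The measure (`SUM(revenue)`) is identical in both queries; only the dimension in the `GROUP BY` changes, which is what lets one measure definition serve many reports.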


Star Schema: Organizing the Story

To understand how measures and dimensions connect, it helps to picture a star schema:

  • The fact table sits at the center (e.g., Sales Fact).
  • Around it, dimension tables provide descriptive detail (e.g., Customer, Product, Date).
  • Lines connect facts to dimensions, forming a star.

A snowflake schema is a more normalized variant, where dimensions themselves branch into sub-tables.

Figure: A star schema.

MetaKarta uses these concepts virtually. Instead of forcing you to physically remodel data, you can overlay a logical star schema on your existing tables. This flexibility enables you to adapt to different warehouses without needing to restructure the data.
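
One common way to overlay a logical star schema without moving data is to define the fact and dimensions as views over existing tables. The sketch below shows that general technique with Python's standard `sqlite3` module; it is an assumption-laden illustration rather than MetaKarta's actual mechanism, and every table, view, and column name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One wide, denormalized table as it might already exist in a warehouse.
CREATE TABLE raw_orders (
    order_id INTEGER, customer_name TEXT, product_name TEXT, amount REAL
);
INSERT INTO raw_orders VALUES
  (1, 'Ada', 'widget', 10.0),
  (2, 'Ada', 'gadget', 25.0),
  (3, 'Bob', 'widget', 10.0);

-- Logical dimensions and fact, defined as views: no data is copied or moved.
CREATE VIEW dim_customer AS SELECT DISTINCT customer_name FROM raw_orders;
CREATE VIEW dim_product  AS SELECT DISTINCT product_name  FROM raw_orders;
CREATE VIEW fact_sales   AS
    SELECT order_id, customer_name, product_name, amount FROM raw_orders;
""")

# Query the logical star: the fact view joined to a dimension view.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f JOIN dim_product AS p USING (product_name)
    GROUP BY p.product_name ORDER BY p.product_name
""").fetchall()
```

Because the star exists only as view definitions, the underlying `raw_orders` table keeps its original shape, and the same logical model could in principle be re-pointed at a differently structured source.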


MetaKarta Semantic Language (MSL)

To capture all these ideas in a structured, reusable way, MetaKarta provides the MetaKarta Semantic Language (MSL).

  • YAML-based: Easy to read and version-control.
  • Git-integrated: Models live alongside code for collaboration and history.
  • Composable: Each object (table, measure, dimension, hierarchy) is defined separately but fits into the larger model.
  • Deployable: Models can be pushed to warehouses and BI tools, ensuring consistency across the stack.

With MSL, data engineers define a model once, and every BI user benefits from a consistent semantic layer.