Skip to content

Core Concepts

  • Connection: Defines access to physical database tables (and views).
  • Semantic Table: Logical collection of semantic columns, calculated semantic columns, metrics, and their database table context
  • Model: Container that groups semantic tables and defines their relationships.
  • Perspectives: Deployable subsets of a model. (FUTURE)
  • Catalog: Grouping of interrelated connections, semantic tables, and models.
  • Composite Model: A Model that combines multiple others. (FUTURE)

What's in this repository?

SHL aims to promote the building of reusable models and semantic objects.

  1. A YAML-based Language Specification: The SHL specification is documented and encompasses tabular and multidimensional constructs.
  2. Pre-built Semantic Models: The GitHub repository contains pre-built semantic models that incorporate standard data models, such as TPC-DS, common training models like Worldwide Importers and AdventureWorks, and marketplace models like Snowplow and CRISP. TBD
  3. SHL CLI: Command line interface (CLI) for validating SHL syntax and expressions. This includes a reference CLI for deploying SHL models to a proprietary semantic layer platform.

SHL Example

The following is an example of an SHL model object:

- name: TPC-H Orders Customer-Nation-Region
  definition: Orders fact with snowflake Customer (Nation/Region) dimension

  fact_semantic_table: Orders

  joins:
    - to_semantic_table: Customer Nation Region
      from_column: o_orderkey

SHL Object Hierarchy

SHL Object Documentation

Glossary - Definitions of key terms used throughout SHL documentation.

The following sections describe the different SHL object types as well as the properties available for each:

  • Catalog - Top-level file directory that hosts model files and optionally defines their common settings.
  • Model - Defines the logical, business-friendly representation on top of the physical data.
  • Semantic Table - user-friendly, business-oriented structure that organizes raw data for analysis by relating quantitative values (measures) to descriptive categories (dimensions)
  • Row Security - Defines row-level data access rules for users and groups.
  • Connection - Defines a database catalog and schema for connecting datasets to the physical data platform.
  • Composite Model - Defines a model made up of multiple other models.

Tutorial Models

  1. Internet Sales Model - a simple, single-factor model derived from the fictitious AdventureWorks retail dataset.
  2. World Wide Importers Model - a more complex, multi-fact model representing a fictional wholesale and distribution company.

The following graphic illustrates the key SHL objects and their relationships:

SHL Object Relationships

The following graphic illustrates the key SHL objects and their relationships:

Inheritance Diagram

---
config:
    class:
        hideEmptyMembersBox: true
---
classDiagram

`SHL Object` <|-- `Semantic Catalog`
`SHL Object` <|-- `Semantic Connection`
`SHL Object` <|-- `Semantic Model`
`SHL Object` <|-- `Semantic Table`
`SHL Object` <|-- Level

`SHL Object` <|-- `BI Object`

`BI Object` <|-- `Semantic Column`
`BI Object` <|-- Metric
`BI Object` <|-- Hierarchy

class `SHL Object` {
    <<interface>>
    String name [1..1]
    String definition
    String objectID
}

class `BI Object`{
    <<interface>>
    Boolean hide
    String folder
}

Composition Diagram

---
config:
    class:
        hideEmptyMembersBox: true
---
classDiagram

`Semantic Catalog` *-- `Semantic Connection` 
`Semantic Catalog` *-- `Semantic Table`
`Semantic Catalog` *-- `Semantic Model`

`Semantic Model` o-- `Semantic Connection` 
`Semantic Model` o-- `Semantic Table`:fact 
`Semantic Model` *-- `Join` 

`Join` o-- `Semantic Table`:joined dimensions

Semantic Model Catalog

TBD: root folder for now until it gets properties

SHL Object

name

  • Type: string
  • Required: Yes

Identifies the object. Must be unique among its peers. For example,

  • all connections in a catalog
  • all models in a catalog
  • all semantic tables in a catalog
  • all columns and metrics in a semantic table

It is used to identify the object in the BI tool. It can have letters, numbers, spaces, and underscores. For example, "First Name" or "first_name".

It is used to generate database-specific names representing the object in the database layers

The name is used to generate database-specific identifiers when the id parameter is missing.. For example, the "Sales Amount" name can be turned into "sales_amount" in the database semantic layer.

TBD: Do we need unique and business names?

id

  • Type: string
  • Required: No

The object's physical name in the database. When id is not specified, the name is used by replacing spaces with underscores and applying the database canonical case (e.g., UPPERCASE for Oracle and lower_case for Databricks). For example, name: First Name id: first_name # Databricks id: FIRST_NAME # Oracle

definition

  • Type: string
  • Required: No

The object's description.

objectID

  • Type: string
  • Required: No

Internal application identifier string

BI Object

hide

  • Type: string
  • Required: No

Set it to 'true' to hide the object in the client tool. The default is false.

folder

  • Type: string
  • Required: No

The name of a Folder that groups the object.

BI tools can show Table and Folder groups. For example, Folders group objects inside Tables.

General Rules

Database object naming

A column or table (e.g., source) name property must use the database canonical name (e.g., ORDER for Oracle and order for PostgreSQL). The name is used as-is for referencing columns (e.g., order.id). The application surrounds the name with database-specific quotes when it contains special characters, like spaces (e.g., "Recent Order".id for Oracle).

TBD: Reference a column name

The following table shows how to format column names depending on the characters they contain.

Case Source column name(s) Reference expression(s) Notes
No spaces revenue expr: "revenue"
expr: 'revenue'
expr: revenue
Use double quotes, single quotes, or no quotes around the column name.
With spaces First Name expr: "First Name" Use backticks to escape spaces. Enclose the entire expression in double quotes.
Column name with spaces in a SQL expression First Name and Last Name expr: CONCAT(First Name, ,Last Name) If the expression doesn't start with backticks, double quotes are not necessary.
Quotes are included in the source column name "name" expr: '"name"' Use backticks to escape the double-quotes in the column name. Enclose that expression in single quotes in the YAML definition.

Use expressions with colons

Case Expression Notes
Expressions with colons expr: "CASE WHENCustomer Tier= 'Enterprise: Premium' THEN 1 ELSE 0 END" Wrap the entire expression in double quotes for correct interpretation

[!NOTE]

YAML interprets unquoted colons as key-value separators. Always use double quotes around expressions that include colons.

Multi-line indentation

Case Expression Notes
Multi-line indentation expr: > CASE WHEN revenue > 100 THEN 'High' ELSE 'Low' END Indent the expression under the first line

[!NOTE]

Use the > block scalar after expr: for multi-line expressions. All lines must be indented at least two spaces beyond the expr key for correct parsing.

Source

You can use a table-like asset or a SQL query as the source for your metric view. To use a table-like asset, you must have at least SELECT privileges on the asset.

Use a table as a source

To use a table as a source, include the fully-qualified table name, as in the following example.

source: samples.tpch.orders

Use a SQL query as a source

To use a SQL query, write the query text directly in the YAML.

source: SELECT * FROM samples.tpch.orders o
  LEFT JOIN samples.tpch.customer c
  ON o.o_custkey = c.c_custkey

YAML Syntax Tips

Optional singular aliases for plural properties

In SHL, list properties (like columns) always support one or many values. For convenience, a singular alias (e.g., column) may also be accepted, but only the plural form is documented and recommended. The singular form is designed for advanced users who prefer to type shorter text for one-value cases.

column: c1  # one-liner and short but allows you to specify one value 
# OR
columns:    # takes multiple lines and enables you to specify multiple values
    - c1