Skip to main content

ViewDefinition Modeling

FHIR4DS implements the HL7 SQL-on-FHIR v2 specification, providing a standardized, portable way to flatten complex FHIR resources into analytical tables. This allows data scientists to query FHIR using standard SQL joins and aggregations without manual ETL.

1. Overview

The viewdef module translates declarative JSON ViewDefinitions into optimized DuckDB SQL. It allows you to define projections of FHIR resources that are independent of the underlying storage engine.

  • Spec Compliant: 100% compliance with the SQL-on-FHIR v2 specification (140 tests).
  • Portable: Views defined in FHIR4DS can be used across any spec-compliant implementation.
  • SQL-Native: Generates inspectable DuckDB SQL that leverages columnar performance.

2. Core Concepts

A ViewDefinition consists of three primary components:

The Resource Focus

Specifies the base FHIR resource type (e.g., Patient, Observation, Encounter).

Select and Column

Defines the mapping from FHIRPath expressions to SQL columns.

  • Path: The FHIRPath used to extract data.
  • Name: The resulting SQL column name.

Flattening (forEach)

Handles one-to-many relationships (e.g., a Patient with multiple names or an Observation with multiple components).

  • forEach: Creates one row for every matching element (Inner Join semantics).
  • forEachOrNull: Preserves the parent row even if the collection is empty (Left Join semantics).

3. Quick Start

Flattening a resource involves defining the mapping and calling the generator.

import json
import fhir4ds

# 1. Define the View (Portable JSON)
view_definition = {
"resource": "Patient",
"select": [{
"column": [
{"path": "id", "name": "patient_id"},
{"path": "gender", "name": "gender"},
{"path": "name.first().family", "name": "last_name"}
]
}]
}

# 2. Generate optimized DuckDB SQL
sql = fhir4ds.generate_view_sql(json.dumps(view_definition))

# 3. Execute
con = fhir4ds.create_connection()
df = con.execute(sql).df()

4. Complex Modeling

ViewDefinitions support deep nesting and filtering, enabling the creation of production-grade analytical datasets.

view_definition = {
"resource": "Observation",
"where": [{"path": "status = 'final'"}],
"select": [
{"column": [{"path": "id", "name": "obs_id"}]},
{
"forEach": "component",
"column": [
{"path": "code.coding.first().code", "name": "comp_code"},
{"path": "value.ofType(Quantity).value", "name": "comp_value"}
]
}
]
}

This example produces one row per observation component, filtering out any resources that are not in a 'final' status.


tip

Spec & Compliance For technical details on the generator class and the full compliance report, see the ViewDefinition API Reference.