
fhir4ds.sources

The sources module provides Zero-ETL source adapters that mount external data as the standard resources view in DuckDB, enabling CQL measures, FHIRPath queries, and ViewDefinitions to run directly against external data without copying it.

Top-Level API

fhir4ds.attach(con, adapter)

Registers a source adapter against an existing DuckDB connection.

fhir4ds.attach(con, adapter)
| Parameter | Type | Description |
| --- | --- | --- |
| con | duckdb.DuckDBPyConnection | An active DuckDB connection. |
| adapter | SourceAdapter | Any object implementing the SourceAdapter protocol. |

Raises:

  • SchemaValidationError — if the adapter's view does not conform to the required schema.
  • TypeError — if adapter does not implement the SourceAdapter protocol.
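The TypeError case can be pictured as a structural check on the adapter. The following is a minimal sketch of that idea, not the fhir4ds implementation: `attach_sketch`, `GoodAdapter`, and `NotAnAdapter` are hypothetical names, and the `SourceAdapter` class here is a local stand-in for the protocol documented on this page.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SourceAdapter(Protocol):
    """Local stand-in for the protocol documented on this page."""

    def register(self, con: Any) -> None: ...
    def unregister(self, con: Any) -> None: ...


def attach_sketch(con: Any, adapter: Any) -> None:
    """Hypothetical stand-in for fhir4ds.attach: check the protocol, then register."""
    # isinstance() against a runtime_checkable Protocol only verifies that the
    # methods exist; it does not check their signatures.
    if not isinstance(adapter, SourceAdapter):
        raise TypeError(
            f"{type(adapter).__name__} does not implement the SourceAdapter protocol"
        )
    adapter.register(con)


class GoodAdapter:
    """Has both required methods, so it satisfies the protocol."""

    def __init__(self) -> None:
        self.registered = False

    def register(self, con: Any) -> None:
        self.registered = True

    def unregister(self, con: Any) -> None:
        self.registered = False


class NotAnAdapter:
    """Has neither method, so it is rejected."""


good = GoodAdapter()
attach_sketch(None, good)  # no real connection is needed for this sketch

try:
    attach_sketch(None, NotAnAdapter())
    rejected = False
except TypeError:
    rejected = True
```

Because the check is structural, an adapter does not need to inherit from any base class; it only needs the two methods.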

fhir4ds.detach(con, adapter)

Unregisters an adapter, dropping the resources view and releasing any external connections.

fhir4ds.detach(con, adapter)
| Parameter | Type | Description |
| --- | --- | --- |
| con | duckdb.DuckDBPyConnection | An active DuckDB connection. |
| adapter | SourceAdapter | The adapter to unregister. |

fhir4ds.create_connection(source=...)

The source parameter accepts any SourceAdapter. When provided, the adapter is registered immediately after connection creation:

con = fhir4ds.create_connection(
    source=FileSystemSource('/data/fhir/**/*.parquet')
)

SourceAdapter Protocol

All adapters implement the SourceAdapter protocol, which requires two methods:

from typing import Protocol

class SourceAdapter(Protocol):
    def register(self, con) -> None: ...
    def unregister(self, con) -> None: ...

register(con) — Creates the resources view and validates the schema. Must be idempotent (safe to call multiple times). Must call validate_schema() before returning.

unregister(con) — Drops the resources view and releases external connections. Safe to call even if register() was never called.
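The two requirements above (idempotent registration, unregistration that is always safe) map naturally onto `CREATE OR REPLACE VIEW` and `DROP VIEW IF EXISTS`. The sketch below illustrates that mapping under stated assumptions: `TableViewAdapter` and `RecordingConnection` are hypothetical names, the stub connection only records SQL so the example runs without a database, and the call to `validate_schema()` is left as a comment because its import path is not shown on this page.

```python
from typing import Any, List


class RecordingConnection:
    """Stand-in for a DuckDB connection; it only records the SQL it receives."""

    def __init__(self) -> None:
        self.statements: List[str] = []

    def execute(self, sql: str) -> "RecordingConnection":
        self.statements.append(sql)
        return self


class TableViewAdapter:
    """Hypothetical adapter that exposes an existing table as the resources view."""

    def __init__(self, table_name: str) -> None:
        self.table_name = table_name

    def register(self, con: Any) -> None:
        # Double any embedded quotes, then wrap the name in double quotes.
        quoted = '"' + self.table_name.replace('"', '""') + '"'
        # CREATE OR REPLACE keeps register() idempotent: a second call
        # replaces the view instead of failing.
        con.execute(
            "CREATE OR REPLACE VIEW resources AS "
            f"SELECT id, resourceType, resource, patient_ref FROM {quoted}"
        )
        # A real adapter would call validate_schema(con, type(self).__name__)
        # here, before returning.

    def unregister(self, con: Any) -> None:
        # IF EXISTS makes this safe even when register() was never called.
        con.execute("DROP VIEW IF EXISTS resources")


con = RecordingConnection()
adapter = TableViewAdapter("fhir_rows")
adapter.register(con)
adapter.register(con)  # repeated call issues the same statement again
adapter.unregister(con)
```

With a real DuckDB connection in place of the stub, the same two statements would create and drop the view.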

Schema Contract

The resources view must expose exactly these columns:

| Column | Type | Description |
| --- | --- | --- |
| id | VARCHAR | FHIR resource logical ID |
| resourceType | VARCHAR | FHIR resource type (e.g. "Patient") |
| resource | JSON | Complete FHIR resource as JSON |
| patient_ref | VARCHAR | Raw logical ID of the Patient this resource belongs to |

SchemaValidationError

class SchemaValidationError(Exception)

Raised at registration time when a source adapter's resources view does not conform to the required schema. The error message identifies the missing or mistyped column and the adapter class.


validate_schema()

validate_schema(con, adapter_class_name: str) -> None

Validates that the resources view conforms to the required schema. Every adapter must call this immediately after creating the view.

| Parameter | Type | Description |
| --- | --- | --- |
| con | duckdb.DuckDBPyConnection | An active DuckDB connection. |
| adapter_class_name | str | Name of the calling adapter class (used in error messages). |

Raises: SchemaValidationError if the view is absent, or any required column is missing or has the wrong type.
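The missing-or-mistyped check can be illustrated with a plain dictionary comparison. This is a sketch only: `check_columns_sketch` is a hypothetical helper, the exception class is a local stand-in for the fhir4ds one, and the way the real function reads column types from DuckDB is not described on this page, so the sketch takes a name-to-type mapping as input.

```python
from typing import Dict, List

# Required columns and types, copied from the Schema Contract table.
REQUIRED_COLUMNS: Dict[str, str] = {
    "id": "VARCHAR",
    "resourceType": "VARCHAR",
    "resource": "JSON",
    "patient_ref": "VARCHAR",
}


class SchemaValidationError(Exception):
    """Local stand-in for the fhir4ds exception of the same name."""


def check_columns_sketch(actual: Dict[str, str], adapter_class_name: str) -> None:
    """Raise if a required column is missing or has the wrong type."""
    problems: List[str] = []
    for name, expected in REQUIRED_COLUMNS.items():
        if name not in actual:
            problems.append(f"missing column '{name}'")
        elif actual[name].upper() != expected:
            problems.append(f"column '{name}' is {actual[name]}, expected {expected}")
    if problems:
        # Name the adapter class so the caller can tell which adapter failed.
        raise SchemaValidationError(f"{adapter_class_name}: " + "; ".join(problems))


# A view whose resource column is VARCHAR and that lacks patient_ref.
bad_view = {"id": "VARCHAR", "resourceType": "VARCHAR", "resource": "VARCHAR"}
try:
    check_columns_sketch(bad_view, "MyAdapter")
    message = ""
except SchemaValidationError as exc:
    message = str(exc)

# A conforming view passes silently.
check_columns_sketch(dict(REQUIRED_COLUMNS), "MyAdapter")
```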


quote_identifier()

quote_identifier(name: str) -> str

Safely quotes a DuckDB identifier to prevent SQL injection. Escapes internal double-quotes by doubling them, then wraps in double-quotes.

Used internally by PostgresSource and ExistingTableSource to safely handle user-supplied table and column names.
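The quoting rule described above is short enough to sketch directly. The function below is an illustration of that rule written for this page, not the library's source; the name `quote_identifier_sketch` is hypothetical.

```python
def quote_identifier_sketch(name: str) -> str:
    """Double any embedded double-quotes, then wrap the result in double-quotes."""
    escaped = name.replace('"', '""')
    return f'"{escaped}"'


plain = quote_identifier_sketch("patients")
tricky = quote_identifier_sketch('my"table')
# An injection attempt stays inside a single quoted identifier.
hostile = quote_identifier_sketch('x"; DROP TABLE y; --')
```

Here `plain` is `"patients"`, `tricky` is `"my""table"`, and `hostile` is `"x""; DROP TABLE y; --"`, so the embedded quote can no longer terminate the identifier.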


Built-In Adapters

| Adapter | Purpose | Details |
| --- | --- | --- |
| FileSystemSource | Parquet, NDJSON, Iceberg (local or cloud) | API Reference → |
| PostgresSource | FHIR JSON in Postgres columns | API Reference → |
| ExistingTableSource | Wrap pre-loaded DuckDB tables | API Reference → |
| CSVSource | CSV files with user-defined projection | API Reference → |

CloudCredentials

CloudCredentials(provider: str, secret_name: str = None, **kwargs)

Encapsulates DuckDB secret configuration for cloud storage access.

| Parameter | Type | Description |
| --- | --- | --- |
| provider | str | 'S3', 'AZURE', or 'GCS' |
| secret_name | str | Optional name for the DuckDB secret. Defaults to fhir4ds_{provider}_secret. |
| **kwargs | | Provider-specific credential fields (see below). |

Provider-specific fields:

| Provider | Fields |
| --- | --- |
| S3 | access_key_id, secret_access_key, region, endpoint_url |
| Azure | connection_string, or account_name + account_key |
| GCS | service_account_json |

Method: configure(con) — Registers a DuckDB secret for this provider.
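For orientation, a DuckDB secret for S3 is created with a `CREATE SECRET` statement. The sketch below builds such a statement from three of the S3 fields listed above. It is an assumption about the kind of SQL that configure(con) could issue, not a description of what fhir4ds actually runs: `s3_secret_sql_sketch` and `sql_literal` are hypothetical helpers, the mapping of field names to DuckDB options is a guess, and `endpoint_url` is left out.

```python
from typing import List, Optional


def sql_literal(value: str) -> str:
    """Render a Python string as a single-quoted SQL string literal."""
    return "'" + value.replace("'", "''") + "'"


def s3_secret_sql_sketch(
    secret_name: str,
    access_key_id: str,
    secret_access_key: str,
    region: Optional[str] = None,
) -> str:
    """Build a DuckDB CREATE SECRET statement for S3 (illustrative only)."""
    # Keep the sketch simple: accept only plain identifiers as secret names.
    if not secret_name.isidentifier():
        raise ValueError(f"unsupported secret name: {secret_name!r}")
    options: List[str] = [
        "TYPE S3",
        f"KEY_ID {sql_literal(access_key_id)}",
        f"SECRET {sql_literal(secret_access_key)}",
    ]
    if region is not None:
        options.append(f"REGION {sql_literal(region)}")
    joined = ", ".join(options)
    return f"CREATE OR REPLACE SECRET {secret_name} ({joined})"


# Placeholder values, not real credentials.
statement = s3_secret_sql_sketch(
    "my_s3_secret", "EXAMPLE_KEY", "EXAMPLE_SECRET", region="us-east-1"
)
```

The resulting statement is `CREATE OR REPLACE SECRET my_s3_secret (TYPE S3, KEY_ID 'EXAMPLE_KEY', SECRET 'EXAMPLE_SECRET', REGION 'us-east-1')`.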