MongoFhirServerSource

fhir4ds.sources.MongoFhirServerSource mounts current FHIR resources from a Mongo-backed FHIR server as the standard FHIR4DS resources view.

It uses DuckDB's community mongo extension and mongo_scan; it does not use a Python MongoDB client.

Class Signature

MongoFhirServerSource(
    connection_string: str,
    *,
    schema: MongoFhirServerSchema | None = None,
    attachment_name: str = "fhir4ds_mongo",
    install_extension: bool = True,
)

| Parameter | Type | Description |
| --- | --- | --- |
| connection_string | str | MongoDB URI passed to DuckDB mongo_scan. |
| schema | MongoFhirServerSchema \| None | Mongo collection and resource layout configuration. Defaults to Helix/icanbwell per-resource collections. |
| attachment_name | str | DuckDB attachment name used only for best-effort collection discovery. |
| install_extension | bool | If True, runs INSTALL mongo FROM community before LOAD mongo. |
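
As a rough illustration of the install_extension flag described above, the following sketch lists the DuckDB statements that would be issued in each case. The helper name extension_statements is hypothetical and not part of fhir4ds; only the two SQL statements come from the parameter description.

```python
def extension_statements(install_extension: bool = True) -> list[str]:
    """Return the DuckDB statements implied by the install_extension flag.

    Hypothetical helper for illustration; fhir4ds may structure this differently.
    """
    statements = []
    if install_extension:
        # Fetches the extension from the DuckDB community repository.
        statements.append("INSTALL mongo FROM community")
    # Loading is needed in either case before the extension can be used.
    statements.append("LOAD mongo")
    return statements


print(extension_statements(True))
print(extension_statements(False))
```

Passing install_extension=False is presumably the right choice when the extension is already installed in the environment, for example where downloading extensions at runtime is not possible.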

MongoFhirServerSchema

MongoFhirServerSchema(
    database_name: str = "fhir",
    base_version: str = "4_0_0",
    collection_strategy: Literal["per_resource", "explicit", "shared"] = "per_resource",
    resource_types: tuple[str, ...] | list[str] | None = None,
    collections: tuple[MongoResourceCollection, ...] | list[MongoResourceCollection] | None = None,
    collection_mappings: Mapping[str, str] | None = None,
    shared_collection: str | None = None,
    shared_resource_path: str = "$",
    shared_id_path: str = "$.id",
    shared_resource_type_path: str = "$.resourceType",
    shared_current_filter: Mapping[str, Any] | None = None,
    shared_deleted_filter: Mapping[str, Any] | None = None,
    sample_size: int | None = -1,
    include_hidden: bool = False,
    hidden_tag_system: str = "https://fhir.icanbwell.com/4_0_0/CodeSystem/server-behavior",
    hidden_tag_code: str = "hidden",
    patient_reference_paths: tuple[str, ...] | list[str] = (
        "$.subject.reference",
        "$.patient.reference",
        "$.beneficiary.reference",
    ),
    scrub_private_fields: tuple[str, ...] | list[str] = (
        "_id",
        "_uuid",
        "_sourceId",
        "_sourceAssigningAuthority",
    ),
)

| Parameter | Description |
| --- | --- |
| database_name | Mongo database name. |
| base_version | Suffix used by per_resource collection names, such as 4_0_0. |
| collection_strategy | Collection layout strategy. |
| resource_types | Resource types to mount. Required for per_resource and shared; optional filter for explicit mappings. |
| collections | Full explicit MongoResourceCollection entries. |
| collection_mappings | Shorthand mapping from resource type to collection name for root-document layouts. |
| shared_collection | Collection name used by shared strategy. |
| shared_resource_path | JSON path to the FHIR resource in shared-collection rows. |
| shared_id_path | JSON path to the FHIR resource id in shared-collection rows. |
| shared_resource_type_path | JSON path to the FHIR resource type in shared-collection rows. |
| shared_current_filter | Inclusion Mongo filter applied to every shared-collection resource type. |
| shared_deleted_filter | Delete marker filter wrapped in $nor. |
| sample_size | DuckDB mongo_scan sample size. Use -1 for full inference, or None to omit the option. |
| include_hidden | If False, exclude resources with the configured hidden tag. |
| patient_reference_paths | JSON paths checked for patient references inside the FHIR resource JSON. |
| scrub_private_fields | Root fields removed from emitted resource JSON with json_merge_patch. |
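
To make patient_reference_paths and scrub_private_fields more concrete, here is a minimal pure-Python sketch of what the two options describe. It is not the library's implementation, which runs inside DuckDB. The helper names are hypothetical, the path resolver handles only simple dotted paths, and the "first matching path wins" rule is an assumption. The scrub step follows JSON Merge Patch semantics (RFC 7396), where a null value in the patch removes the key.

```python
from typing import Any


def resolve_path(doc: Any, path: str) -> Any:
    """Resolve a simple dotted JSON path such as '$.subject.reference'."""
    if path == "$":
        return doc
    current = doc
    for key in path.removeprefix("$.").split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def first_patient_reference(resource: dict, paths: list[str]) -> Any:
    """Return the first string found at any configured path (assumed order)."""
    for path in paths:
        value = resolve_path(resource, path)
        if isinstance(value, str):
            return value
    return None


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply an RFC 7396 merge patch; None values in the patch delete keys."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


def scrub(document: dict, private_fields: list[str]) -> dict:
    """Remove root-level private fields by patching each one with null."""
    return merge_patch(document, {field: None for field in private_fields})


# Made-up stored document for illustration only.
stored = {
    "_id": "abc",
    "_uuid": "u-1",
    "resourceType": "Observation",
    "id": "obs-1",
    "subject": {"reference": "Patient/p-1"},
}
paths = ["$.subject.reference", "$.patient.reference", "$.beneficiary.reference"]
print(first_patient_reference(stored, paths))
print(scrub(stored, ["_id", "_uuid", "_sourceId", "_sourceAssigningAuthority"]))
```

Fields listed for scrubbing that are absent from a document are simply ignored, which is why the default list can be applied uniformly to every resource.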

MongoResourceCollection

MongoResourceCollection(
    resource_type: str,
    collection_name: str,
    resource_path: str = "$",
    id_path: str = "$.id",
    resource_type_path: str = "$.resourceType",
    current_filter: Mapping[str, Any] | None = None,
    deleted_filter: Mapping[str, Any] | None = None,
)

Use this class for custom collection names, wrapped FHIR resources, tenant filters, or non-standard delete markers.
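
The interaction between current_filter and deleted_filter can be sketched as plain Mongo filter documents. The sketch below assumes the two are combined with $and and that the delete marker is wrapped in $nor, as stated for shared_deleted_filter; the function name build_filter and the example field names (tenant, deleted) are hypothetical, and the library may assemble its filter differently.

```python
from typing import Any, Mapping, Optional


def build_filter(
    current_filter: Optional[Mapping[str, Any]] = None,
    deleted_filter: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Combine an inclusion filter with a $nor-wrapped delete-marker filter."""
    clauses = []
    if current_filter:
        # Documents must match the inclusion filter (for example, a tenant).
        clauses.append(dict(current_filter))
    if deleted_filter:
        # Documents matching the delete marker are excluded.
        clauses.append({"$nor": [dict(deleted_filter)]})
    if not clauses:
        return {}
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}


# Hypothetical tenant filter and soft-delete marker.
print(build_filter({"tenant": "clinic-a"}, {"deleted": True}))
print(build_filter(None, {"deleted": True}))
```

Keeping the delete marker as a positive match wrapped in $nor means the caller describes what a deleted document looks like, rather than having to express its negation field by field.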

Methods

register(con)

Loads DuckDB's mongo extension, creates the resources view, and calls validate_schema().

Raises SchemaValidationError if the extension cannot load, Mongo cannot be scanned, or the resulting view does not expose the required schema.

unregister(con)

Drops the resources view and detaches the best-effort discovery attachment if one was created.

supports_incremental()

Returns False.

Example

import fhir4ds
from fhir4ds.sources import MongoFhirServerSchema, MongoFhirServerSource

source = MongoFhirServerSource(
    # Placeholder password and host; substitute your own read-only URI.
    "mongodb://readonly:<password>@<host>:27017",
    schema=MongoFhirServerSchema(
        database_name="fhir",
        resource_types=("Patient", "Observation"),
        include_hidden=False,
    ),
)

con = fhir4ds.create_connection(source=source)
rows = con.execute("""
    SELECT id, resourceType, patient_ref
    FROM resources
    ORDER BY resourceType, id
""").fetchall()

Security Notes

Mongo URIs, database names, collection names, JSON paths, and filter documents are quoted as SQL string literals before being passed to DuckDB. Error messages redact credentials and sensitive URI query parameters.
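
The two safeguards described above can be illustrated with a short standard-library sketch. It is not the code fhir4ds uses: the function names, the REDACTED marker, and the set of sensitive query keys are all assumptions made for the example. The quoting rule itself, doubling embedded single quotes, is the standard SQL string-literal escape.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters treated as sensitive.
SENSITIVE_QUERY_KEYS = {"password", "authmechanismproperties"}


def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def redact_uri(uri: str) -> str:
    """Replace the password and sensitive query values in a MongoDB URI."""
    parts = urlsplit(uri)
    netloc = parts.netloc
    if "@" in netloc:
        # Everything before the last '@' is user info; keep only the user name.
        userinfo, hostinfo = netloc.rsplit("@", 1)
        user = userinfo.split(":", 1)[0]
        netloc = f"{user}:REDACTED@{hostinfo}"
    query = urlencode(
        [
            (key, "REDACTED" if key.lower() in SENSITIVE_QUERY_KEYS else value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
        ]
    )
    return urlunsplit((parts.scheme, netloc, parts.path, query, parts.fragment))


print(sql_literal("Patient's_collection"))
print(redact_uri("mongodb://readonly:s3cret@db.example.org:27017/?authSource=admin"))
```

Quoting every user-supplied value as a literal keeps collection names, paths, and filter documents from being interpreted as SQL, and redacting before formatting an error message keeps credentials out of logs and tracebacks.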

DQM Materialization

MongoFhirServerSource remains read-only. HAPI-like queues, change streams, result storage, and generated MeasureReport publishing live in fhir4ds.dqm.mongo_materialization and the fhir4ds dqm mongo ... CLI.

Install the optional dependency set before using that worker:

python3 -m pip install "fhir4ds-v2[mongo]"

Mongo change streams require a replica set or sharded cluster. They are the Mongo-native analogue to HAPI PostgreSQL triggers plus LISTEN/NOTIFY. The materialization worker enables patient-scoped source reads by default via worker.source_patient_pushdown: true.