SQL on FHIR
2.1.0-pre - release
SQL on FHIR - Local Development build (v2.1.0-pre) built by the FHIR (HL7® FHIR® Standard) Build Tools. See the Directory of published versions
The broad ecosystem of tools used for data analysis, not specific to any technology stack, programming language, or use case. Applies equally to such diverse systems as Microsoft Excel, a PostgreSQL database, Tableau or Python code in a Juypter notebook.
The third conceptual "layer" in this specification. This layer is where analytic tools are applied to the now tabular representations of the source FHIR data. This specification is agnostic to the tooling used and the specific queries performed in this layer.
The optional first conceptual layer described by this specification. The purpose of this layer is to provide access to a lossless representation of the FHIR data for onward data transformation.
Examples of this layer include FHIR NDJSON files, FHIR resources held within a JSON-typed column within a relational database, or Parquet files containing full representations of FHIR resources.
The data may optionally be "annotated" to improve the ease of use or performance of these transformation queries.
Examples of possible annotations include:
Informal synonym for "tabular". Not recommended.
The abstract concept of a collection of related data organized in rows and columns. Does not imply a physical table within a database system since it can also refer to R or Pandas "dataframe" and similar abstractions.
FHIR data projected into a tabular form. Note there is no requirement that the data be persisted or materialized, i.e. the table may be dynamic and/or ephemeral.
The process of extracting repeating elements of a resource into a row for each item.
View Definitions are portable, self-documenting, datastore-independent artifacts that embody a particular choice of tabular projection of single FHIR resource type's elements, possibly from multiple levels of its hierarchy. They are the primary artifacts of this specification.
A naive tabular representation of FHIR resources would create a difficult-to-comprehend, and poorly performing database and world require an impractical number of joins for even simple queries. Therefore, to support both performant access and simplified query creation, View Definitions are a satisfying solution.
These artifacts are read by system-specific "view runners" but are intended to be portable across many systems and "data layers". Each FHIR resource type intended to be exposed in the "analytic layer" will have at least one corresponding view definition for a given use case. Since analytic use cases vary widely, it is expected that many sets of view definitions will exist.
For example, given the same patient population, the use case of creating tables for billing reports versus the use case of identifying sexual orientation and gender identity inequities would require different data elements to be exposed to the analytic layer and would be best served by different view definitions.
The second "layer" in the conceptual architecture and the central focus of this specification. The purpose of the "view layer" is to define and transform FHIR data into the desired tabular form.
The "view layer" is composed of sets of two main components: "view definitions" and "view runners".
System-specific tools or libraries that apply view definitions to the "data layer" creating the tabular views of the "analytics layer".
Example view runners may include: