Synthetic Patient Data Generation
Overview: What is Being Generated?
The generator produces synthetic patient and immunization records that mimic real-world data from provincial registries and are emitted as FHIR-compliant JSON for interoperability. It accounts for differences between the BC and ON data models, such as whether allergy information is tracked.
FHIR Patient Resource Structure
A FHIR Patient Resource consists of core fields and extensions, ensuring consistency across jurisdictions.
Field | FHIR Path | Example Value | Logic |
---|---|---|---|
Patient ID | Patient.id | "patient-001" | Unique identifier. |
Name | Patient.name | { "family": "Singh", "given": ["Simar"] } | Randomized names using Faker. |
Gender | Patient.gender | "male", "female", "other" | Random selection. |
Birth Date | Patient.birthDate | "2012-06-15" | Randomized within given age range. |
Address | Patient.address | { "city": "Toronto", "state": "ON" } | Province-specific logic. |
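The field logic in the table above can be sketched as follows. This is a minimal illustration, not the project's script: the name and city pools, the `generate_patient` and `random_birth_date` helpers, and the default age range are all assumptions, and the standard-library `random` module stands in for Faker so the example has no dependencies. Note that `Patient.name` and `Patient.address` are repeating elements in FHIR, so the sketch emits them as lists.

```python
import random
from datetime import date, timedelta

# Hypothetical value pools; the source script draws names from Faker instead.
FAMILY_NAMES = ["Singh", "Tremblay", "Chen", "Smith"]
GIVEN_NAMES = ["Simar", "Marie", "Wei", "Alex"]
CITIES = {"BC": ["Vancouver", "Victoria"], "ON": ["Toronto", "Ottawa"]}


def random_birth_date(rng, min_age, max_age, today):
    """Pick a birth date so the age is approximately within [min_age, max_age]."""
    latest = today - timedelta(days=int(min_age * 365.25))
    earliest = today - timedelta(days=int((max_age + 1) * 365.25) - 1)
    span = (latest - earliest).days
    return earliest + timedelta(days=rng.randint(0, span))


def generate_patient(index, province, rng, min_age=0, max_age=17, today=None):
    """Build one synthetic FHIR Patient resource as a plain dict."""
    today = today or date.today()
    return {
        "resourceType": "Patient",
        "id": f"patient-{index:03d}",
        "name": [{
            "family": rng.choice(FAMILY_NAMES),
            "given": [rng.choice(GIVEN_NAMES)],
        }],
        "gender": rng.choice(["male", "female", "other"]),
        "birthDate": random_birth_date(rng, min_age, max_age, today).isoformat(),
        # Province-specific logic: the city is drawn from the province's pool.
        "address": [{"city": rng.choice(CITIES[province]), "state": province}],
    }
```

Passing a seeded `random.Random` instance keeps test datasets reproducible between runs.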
Provincial Differences in Data Generation
British Columbia (BC) Specific Fields
- Allergy information is included using the FHIR `AllergyIntolerance` resource.
- Each allergy record carries a SNOMED CT-coded allergy type, a severity level, and a reaction date.
Ontario (ON) Specific Fields
- Ontario does not track allergy information in immunization records.
- The script skips allergy generation for ON patients.
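The provincial branch described above might look like the sketch below. It is illustrative only: `generate_allergy`, `PROVINCES_WITH_ALLERGIES`, and the one-year reaction window are assumptions, and the allergy codes are deliberate placeholders rather than real SNOMED CT concept IDs.

```python
import random
from datetime import date, timedelta

# Placeholder codings for illustration only; real SNOMED CT concept IDs
# would come from the project's terminology source.
ALLERGY_CODES = [
    {"system": "http://snomed.info/sct", "code": "PLACEHOLDER-1", "display": "Example allergy A"},
    {"system": "http://snomed.info/sct", "code": "PLACEHOLDER-2", "display": "Example allergy B"},
]
SEVERITIES = ["mild", "moderate", "severe"]
PROVINCES_WITH_ALLERGIES = {"BC"}  # ON does not track allergies in immunization records


def generate_allergy(patient_id, province, rng, today=None):
    """Return an AllergyIntolerance dict for BC patients, or None for provinces that skip it."""
    if province not in PROVINCES_WITH_ALLERGIES:
        return None
    today = today or date.today()
    # Assumption: reactions are dated at random within the past year.
    onset = today - timedelta(days=rng.randint(0, 365))
    return {
        "resourceType": "AllergyIntolerance",
        "patient": {"reference": f"Patient/{patient_id}"},
        "code": {"coding": [rng.choice(ALLERGY_CODES)]},
        "reaction": [{
            "manifestation": [{"text": "Synthetic reaction"}],
            "severity": rng.choice(SEVERITIES),
            "onset": onset.isoformat(),
        }],
    }
```

Keeping the province list in one set means a third jurisdiction can be added without touching the generation logic.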
Immunization Data Generation (FHIR Standard)
Field | FHIR Path | Example Value | Logic |
---|---|---|---|
Vaccine Type | Immunization.vaccineCode | "MMR", "Influenza" | Random selection. |
Manufacturer | Immunization.manufacturer | "Pfizer", "Moderna" | Random manufacturer assignment. |
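A matching sketch for immunization records is shown below, under stated assumptions: `generate_immunization` is a hypothetical helper, the lot number format is invented for illustration, and the vaccine is represented only by display text rather than a coded concept. In FHIR R4, `vaccineCode` is a CodeableConcept and `manufacturer` is a Reference, so both are emitted as objects.

```python
import random
from datetime import date, timedelta

VACCINES = ["MMR", "Influenza"]
MANUFACTURERS = ["Pfizer", "Moderna"]


def generate_immunization(patient_id, birth_date, rng, today=None):
    """Build one synthetic FHIR Immunization dict for an existing patient."""
    today = today or date.today()
    # The dose is dated somewhere between birth and today.
    days_alive = (today - birth_date).days
    occurrence = birth_date + timedelta(days=rng.randint(0, days_alive))
    return {
        "resourceType": "Immunization",
        "status": "completed",
        "vaccineCode": {"text": rng.choice(VACCINES)},
        "patient": {"reference": f"Patient/{patient_id}"},
        "occurrenceDateTime": occurrence.isoformat(),
        "manufacturer": {"display": rng.choice(MANUFACTURERS)},
        # Hypothetical lot number format, for illustration only.
        "lotNumber": f"LOT-{rng.randint(0, 99999):05d}",
    }
```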
Summary of Key Features
- FHIR-Compliant: Structured to align with HL7 FHIR.
- Synthetic but Realistic: Uses Faker for realistic patient data.
- Handles Provincial Variations: BC and ON have different data models.
- Includes Adverse Reactions & Exemptions: Adds realism for testing.
PT-to-PT Transfer Assumptions
Overview: What is PT-to-PT Transfer?
When a patient moves or seeks healthcare in another province, their immunization record needs to be securely transferred. This Proof of Concept (PoC) focuses on the technical feasibility of such transfers while leaving governance, consent, and policy discussions out of scope.
How the Transfer Works
The transfer follows a structured push-based approach:
- The originating province (data owner) initiates the transfer.
- A secure API transmits immunization records between provinces.
- The transfer occurs only after external consent and authorization.
- The receiving province acknowledges and integrates the record.
Data Transferred Between Provinces
1. Patient Information
- Unique patient identifier within the provincial system.
- Full name, birth date, and gender.
- Address, including city, province, and postal code.
- Health card number (if applicable) for identity matching.
2. Immunization History
- Vaccine type (CVX codes).
- Date of administration and dose number.
- Manufacturer and lot number.
- Site of administration (e.g., left arm, right arm).
- Adverse reactions and exemptions (if applicable).
3. Metadata for PT-to-PT Transfers
- Transfer origin marker indicating the source jurisdiction.
- Receiving province acknowledgment confirming integration.
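One way to package the three groups of data listed above is a single FHIR Bundle carrying the patient, the immunization history, and an origin marker. The sketch below is an assumption about shape, not the PoC's actual payload: `build_transfer_bundle` and the tag system URI `https://example.org/fhir/transfer-origin` are hypothetical.

```python
# Hypothetical tag system used to mark the source jurisdiction.
ORIGIN_TAG_SYSTEM = "https://example.org/fhir/transfer-origin"


def build_transfer_bundle(patient, immunizations, origin):
    """Wrap a patient and their immunizations in a Bundle tagged with the origin province."""
    entries = [{"resource": patient}]
    entries.extend({"resource": imm} for imm in immunizations)
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "meta": {"tag": [{"system": ORIGIN_TAG_SYSTEM, "code": origin}]},
        "entry": entries,
    }
```

Carrying the origin marker in `meta.tag` keeps it separate from the clinical content, so the receiving province can read it without inspecting individual resources.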
Technical Flow of the Transfer
- Patient relocates or seeks healthcare in another province.
- Consent and authorization are completed for the originating province by an external process (outside this PoC).
- A secure API request is triggered by the originating province.
- Receiving province ingests and processes the immunization record.
- The record is now available in the receiving province’s registry.
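The flow above can be simulated with two in-memory registries. This is a simplified sketch under assumptions: the `Registry` class, the `push_transfer` function, and the `consent_confirmed` flag are hypothetical stand-ins for the provincial FHIR repositories, the secure API, and the external consent process.

```python
class Registry:
    """In-memory stand-in for one province's immunization registry."""

    def __init__(self, province):
        self.province = province
        self.records = {}  # patient_id -> list of immunization dicts

    def receive(self, patient_id, immunizations, origin):
        """Ingest pushed records, mark their origin, and return an acknowledgment."""
        stored = self.records.setdefault(patient_id, [])
        for imm in immunizations:
            stored.append({**imm, "transferOrigin": origin})
        return {
            "status": "accepted",
            "receivedBy": self.province,
            "recordCount": len(immunizations),
        }


def push_transfer(source, target, patient_id, consent_confirmed):
    """The originating province pushes a patient's records to the receiving province."""
    if not consent_confirmed:
        # Consent is managed externally; the PoC only checks that it was given.
        raise PermissionError("transfer requires external consent and authorization")
    immunizations = source.records.get(patient_id, [])
    return target.receive(patient_id, immunizations, origin=source.province)
```

A usage example: after `bc.records["patient-001"] = [...]`, calling `push_transfer(bc, on, "patient-001", consent_confirmed=True)` returns the acknowledgment and leaves the records queryable in `on.records`.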
Key Assumptions and Considerations
- This PoC does not handle consent management; it must be externally managed.
- The push-based model ensures data is transferred only when authorized.
- Manual transfer requests (fax, email, policy-driven) remain possible but are out of scope.
- The PoC leverages FHIR repositories to validate secure data exchange.
Testing and Simulation Approach
- FHIR-based simulation using synthetic data.
- Initial scope focused on MMR vaccine records, expandable in future phases.
- Optional UI demonstration to visualize transferred records.
Aggregation & PHAC Data Access
Overview: Why is Aggregation Needed?
PHAC does not require full patient-level data but instead needs structured, summarized reports for national immunization monitoring. Aggregation transforms raw immunization records into anonymized datasets, ensuring accuracy while protecting privacy. This process maintains consistency across provinces, even when different immunization tracking systems are used.
PHAC does not access or process identifiable Personal Health Information (PHI) at any stage. All de-identification is performed at the Provincial/Territorial (PT) level before any data is shared with PHAC. The information shared with PHAC is structured, anonymized, and formatted for public health reporting in compliance with privacy regulations. Aggregation ensures uniform national immunization reporting while aligning with PT-specific privacy frameworks.
Key Data Captured in Aggregation
The aggregation logic extracts the following key fields:
Field | Description |
---|---|
Reference Date | Date of immunization event (reported at an aggregate level). |
Jurisdiction | Province where the immunization was recorded (e.g., BC, ON). |
Age Group | Categorized age ranges (e.g., 0-2 years, 3-5 years). |
Gender | Aggregate counts by gender category (Male, Female, Other). |
Vaccine Type | Type of vaccine administered (e.g., MMR, COVID-19). |
Dose Count | Total number of doses administered in the reporting period. |
Total Patients Vaccinated | Number of unique individuals vaccinated within the reporting period. |
How Data Aggregation Works
- Extract immunization records from FHIR repositories at the PT level.
- De-identify data by removing personally identifiable details (e.g., names, health card numbers) entirely at the PT level before aggregation.
- Categorize data by jurisdiction, age group, gender, and vaccine type.
- Summarize dose counts and calculate total vaccinated individuals.
- Format the final dataset into a structured, anonymized report in compliance with PHAC's reporting framework.
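The steps above can be sketched as a single aggregation pass. The details are assumptions made for illustration: the flattened input record shape, the `aggregate` and `age_group` helpers, the age bands beyond the two named in the table, and the choice of a monthly reference period. The output rows carry only counts; patient identifiers are used internally to count unique individuals and are never emitted.

```python
from collections import defaultdict
from datetime import date

# The first two bands come from the table above; the rest are hypothetical.
AGE_BANDS = [(0, 2, "0-2 years"), (3, 5, "3-5 years"), (6, 17, "6-17 years")]


def age_group(birth_date, on_date):
    """Map a person's age in whole years on a given date to an age band label."""
    had_birthday = (on_date.month, on_date.day) >= (birth_date.month, birth_date.day)
    age = on_date.year - birth_date.year - (0 if had_birthday else 1)
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return "18+ years"


def aggregate(records, jurisdiction):
    """Summarize flattened immunization records into anonymized count rows."""
    doses = defaultdict(int)
    patients = defaultdict(set)
    for r in records:
        given = date.fromisoformat(r["occurrence"])
        born = date.fromisoformat(r["birth_date"])
        key = (given.strftime("%Y-%m"), jurisdiction,
               age_group(born, given), r["gender"], r["vaccine"])
        doses[key] += 1
        patients[key].add(r["patient_id"])  # used only for the unique count
    return [
        {
            "reference_date": key[0],
            "jurisdiction": key[1],
            "age_group": key[2],
            "gender": key[3],
            "vaccine_type": key[4],
            "dose_count": doses[key],
            "total_patients_vaccinated": len(patients[key]),
        }
        for key in sorted(doses)
    ]
```

Because the function reads only the fields it needs and writes only counts, names and health card numbers present in the input never reach the report.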
Benefits for Public Health and PHAC
- Reduces complexity by providing summarized, structured reports rather than raw data.
- Ensures privacy by removing personal identifiers and focusing on aggregate statistics.
- Standardizes immunization reporting across jurisdictions for consistency and interoperability.
- Scales efficiently to include new vaccines and evolving public health priorities.
Addressing Key Concerns and Clarifications
PHAC's Role in De-Identification: Earlier wording of this document implied that PHAC de-identifies data. That is incorrect.
Clarification: PTs are fully responsible for de-identification before sharing data. PHAC does not access raw patient-level data.
PHAC's Role in Aggregation: Earlier wording also suggested that PHAC "fetches" patient data and aggregates it. That is inaccurate.
Clarification: Aggregation is conducted entirely at the PT level. PHAC receives only aggregated, anonymized data.
Technical Infrastructure
Overview
The technical architecture enables secure, real-time immunization data exchange between jurisdictions while ensuring that each province maintains full control over its data. It follows a federated model, using a combination of API gateways, security layers, and standardized data exchange mechanisms.
Key Components
- Provincial Immunization Systems: Each province maintains its own immunization data repository.
- FHIR-Based Data Exchange: Standardized API interactions ensure compatibility between different systems.
- Access Control Gateway: Manages authentication, authorization, and data access policies.
- Interoperability Layer: Enables data transformation and validation to align different provincial data formats.
- Audit & Compliance Framework: Ensures all data access is logged and follows regulatory requirements.
Federated Immunization Data Architecture (UJ-1)
The federated model ensures that each province maintains local control over its immunization records while supporting national-level aggregation for public health surveillance. The architecture consists of:
- FHIR Immunization Registries: Each province maintains its own secure database.
- Synthetic Data Generator: Creates test data for validation.
- SMART Patient Viewer: Allows healthcare providers to view immunization records.
- Aggregator: Summarizes and anonymizes immunization records before sharing with PHAC.
- Federator (PHAC): Receives de-identified, aggregated data for national reporting.
- R-Shiny Dashboards: Provide real-time analytics and insights.

PT-to-PT Transfer Workflow (UJ-2)
The PT-to-PT transfer mechanism enables secure, structured movement of immunization records when a patient relocates between jurisdictions. Records are exchanged directly between provinces, without a central store. The workflow relies on the following components:
- API Gateway: Facilitates secure and authenticated communication.
- FHIR Data Transfer: Ensures records are formatted correctly.
- Message Queue: Manages retries and queued requests.
- Outbound Transfer Service: Extracts and sends immunization data.
- Inbound Transfer Service: Receives and validates incoming records.
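As an illustration of how the message queue and the outbound service could interact, the sketch below retries failed sends a bounded number of times and sets aside anything that still fails. The `OutboundTransferService` class, the `max_attempts` parameter, and the dead-letter list are assumptions for this example; a real deployment would likely use a message broker rather than an in-process queue.

```python
from collections import deque


class OutboundTransferService:
    """Sends queued transfer bundles, retrying failures up to max_attempts times."""

    def __init__(self, send, max_attempts=3):
        self.send = send                # callable that delivers one bundle
        self.max_attempts = max_attempts
        self.queue = deque()            # items are (bundle, attempts_so_far)
        self.dead_letter = []           # bundles that exhausted their retries

    def enqueue(self, bundle):
        self.queue.append((bundle, 0))

    def drain(self):
        """Process the queue until it is empty; return the number delivered."""
        delivered = 0
        while self.queue:
            bundle, attempts = self.queue.popleft()
            try:
                self.send(bundle)
                delivered += 1
            except ConnectionError:
                attempts += 1
                if attempts < self.max_attempts:
                    self.queue.append((bundle, attempts))  # retry later
                else:
                    self.dead_letter.append(bundle)        # give up, keep for review
        return delivered
```

Bounding the retries keeps a persistently unreachable receiver from blocking other transfers, while the dead-letter list preserves the failed bundles for follow-up.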

Data Flow
- A request for immunization data is initiated by an authorized system.
- The Access Control Gateway verifies authentication and consent requirements.
- Once approved, data is retrieved from the provincial immunization system.
- The interoperability layer processes and transforms the data to a standard format.
- The response is securely delivered back to the requester.
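The request path above can be sketched as a single handler. Everything here is hypothetical: the `handle_request` and `to_standard_format` functions, the request fields, the provincial field names `vax` and `given_on`, and the use of HTTP-style status codes are assumptions chosen to make the five steps concrete.

```python
def to_standard_format(raw_record):
    """Interoperability layer: map hypothetical provincial field names to a common shape."""
    return {
        "vaccine_type": raw_record["vax"],
        "occurrence_date": raw_record["given_on"],
    }


def handle_request(request, registry, authorized_clients, consented_patients):
    """Access Control Gateway: authenticate, check consent, retrieve, transform, respond."""
    if request["client_id"] not in authorized_clients:
        return {"status": 401, "body": None}   # authentication failed
    if request["patient_id"] not in consented_patients:
        return {"status": 403, "body": None}   # consent requirement not met
    raw_records = registry.get(request["patient_id"], [])
    body = [to_standard_format(r) for r in raw_records]
    return {"status": 200, "body": body}
```

Checking access before touching the registry means an unauthorized request never reads provincial data, which also keeps the audit trail simple.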
Security & Compliance
- End-to-end encryption is enforced for all data exchanges.
- Mutual TLS authentication ensures secure API communication.
- Audit logs are maintained to track all requests and access events.
- Provincial data sovereignty is preserved by keeping raw data within jurisdictions.
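For the mutual TLS requirement, a server-side context can be built with Python's standard `ssl` module so that clients must present a certificate. This is a minimal sketch: the `build_mtls_server_context` function and its file-path parameters are hypothetical, and the TLS 1.2 minimum is an assumption rather than a stated project requirement.

```python
import ssl


def build_mtls_server_context(certfile=None, keyfile=None, client_ca_file=None):
    """Create a server-side TLS context that requires a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents
    # a certificate that chains to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)  # server identity
    if client_ca_file:
        ctx.load_verify_locations(cafile=client_ca_file)         # CA that signs client certs
    return ctx
```

In practice each province would supply its own server certificate and the CA bundle used to verify the other province's client certificates.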
Scalability & Infrastructure
- Containerized microservices enable efficient scaling.
- Event-driven architecture allows real-time data synchronization.
- Cloud-hosted infrastructure supports high availability and fault tolerance.
Future Roadmap & Next Steps
This section outlines upcoming developments and the roadmap for the Interoperable Immunization Data Initiative.