Emergency Response Synthetic Data Pipeline

AI/ML, Data Engineering

Built for an emergency response technology platform

Overview

Designed a synthetic data generation pipeline for an emergency response technology platform. The 60+ page plan indexes 10+ public EMS/health data sources (MIMIC-IV-ED, NEMSIS, NYC/SF open data, NHS), defines vital signs reference tables for 20+ emergency conditions, models temporal progressions (shock, respiratory failure, sepsis, anaphylaxis), and outlines a 5-phase implementation roadmap with HIPAA risk mitigation.

The Problem

The platform needed realistic emergency response training data but couldn't use real patient records due to HIPAA regulations. Existing synthetic data tools produced unrealistic vitals and progression patterns.

My Approach

Cataloged 10+ public EMS/health data sources as statistical foundations. Built vital signs reference tables for 20+ conditions with realistic ranges and progression curves. Designed temporal models for critical conditions (shock, sepsis, anaphylaxis). Created a 5-phase implementation roadmap using Synthea for patient generation, SDV/CTGAN for tabular synthesis, and domain rules for clinical validity.
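
To illustrate how a vital signs reference table and a temporal progression can fit together, here is a minimal sketch. It is not the plan's actual implementation: the table contents are illustrative placeholders rather than clinical reference values, the names `VitalRange`, `REFERENCE`, `simulate_progression`, and `is_plausible` are hypothetical, and straight linear interpolation stands in for whatever progression curves the plan defines. It uses only the Python standard library.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class VitalRange:
    low: float
    high: float


# Illustrative placeholder ranges only. These are NOT clinical reference
# values and are not taken from the plan's actual tables.
REFERENCE = {
    "baseline": {
        "heart_rate": VitalRange(60, 100),
        "systolic_bp": VitalRange(100, 130),
    },
    "late_shock": {
        "heart_rate": VitalRange(120, 150),
        "systolic_bp": VitalRange(60, 85),
    },
}


def interpolate(start: VitalRange, end: VitalRange, t: float) -> VitalRange:
    """Linearly blend two ranges; t=0 returns start, t=1 returns end."""
    return VitalRange(
        start.low + (end.low - start.low) * t,
        start.high + (end.high - start.high) * t,
    )


def simulate_progression(steps: int, seed: int = 0) -> list[dict[str, float]]:
    """Sample one reading per step while the ranges drift toward late shock."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    rows = []
    for i in range(steps):
        t = i / (steps - 1) if steps > 1 else 0.0
        row = {"minute": i * 5}  # assumed 5-minute sampling interval
        for vital, start in REFERENCE["baseline"].items():
            current = interpolate(start, REFERENCE["late_shock"][vital], t)
            row[vital] = round(rng.uniform(current.low, current.high), 1)
        rows.append(row)
    return rows


def is_plausible(row: dict[str, float]) -> bool:
    """Hypothetical domain rule: reject physiologically impossible readings."""
    return 0 < row["heart_rate"] < 300 and 0 < row["systolic_bp"] < 300


if __name__ == "__main__":
    for reading in simulate_progression(steps=7, seed=42):
        print(reading, "ok" if is_plausible(reading) else "rejected")
```

In a fuller pipeline, rows like these would more plausibly serve as rule-based checks or seed data alongside Synthea and SDV/CTGAN output than replace them. The validity rule here is deliberately loose and would need clinical review before any real use.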

Key Results

20+
Conditions
Emergency scenarios modeled
10+
Data Sources
Public health datasets indexed
5
Phases
Implementation roadmap

Tech Stack

Python, Synthea, SDV/CTGAN, Faker, Pandas, NumPy, Public Health Datasets