Emergency Response Synthetic Data Pipeline
Built for an emergency response technology platform
Overview
Designed a synthetic data generation pipeline for an emergency response technology platform. The 60+ page plan catalogs 10+ public EMS and health data sources (MIMIC-IV-ED, NEMSIS, NYC and San Francisco open data, NHS), defines vital signs reference tables for 20+ emergency conditions, models temporal progressions (shock, respiratory failure, sepsis, anaphylaxis), and lays out a 5-phase implementation roadmap with HIPAA risk mitigation.
The Problem
The platform needed realistic emergency response training data but could not use real patient records because of HIPAA restrictions. Existing synthetic data tools produced unrealistic vital signs and progression patterns.
My Approach
Cataloged 10+ public EMS and health data sources to serve as statistical foundations. Built vital signs reference tables for 20+ conditions, each with realistic ranges and progression curves. Designed temporal models for critical conditions (shock, respiratory failure, sepsis, anaphylaxis). Created a 5-phase implementation roadmap that uses Synthea for patient generation, SDV/CTGAN for tabular synthesis, and domain rules to enforce clinical validity.
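To illustrate how a vital signs reference table, a temporal progression, and a domain rule might fit together, here is a minimal Python sketch. It is not the plan's actual implementation: the names REFERENCE, HARD_LIMITS, interpolated_range, and generate_series are hypothetical, the numeric ranges are placeholders rather than clinical reference values, and linear interpolation between stages is an assumed simplification of the progression curves.

```python
import random

# Hypothetical placeholder ranges (low, high) per stage.
# These are illustrative only and NOT clinical reference values.
REFERENCE = {
    "sepsis": [
        {"heart_rate": (90, 110), "systolic_bp": (100, 120), "spo2": (94, 98)},   # early
        {"heart_rate": (110, 130), "systolic_bp": (85, 100), "spo2": (90, 95)},   # progressing
        {"heart_rate": (130, 150), "systolic_bp": (65, 85), "spo2": (84, 91)},    # severe
    ],
}

# Assumed domain rule: hard plausibility limits applied to every generated value.
HARD_LIMITS = {"heart_rate": (20, 250), "systolic_bp": (40, 260), "spo2": (50, 100)}


def interpolated_range(stages, vital, progress):
    """Linearly blend the (low, high) range between adjacent stages.

    progress runs from 0.0 (first stage) to 1.0 (last stage).
    Expects at least two stages.
    """
    position = progress * (len(stages) - 1)
    idx = min(int(position), len(stages) - 2)
    frac = position - idx
    lo_a, hi_a = stages[idx][vital]
    lo_b, hi_b = stages[idx + 1][vital]
    return lo_a + frac * (lo_b - lo_a), hi_a + frac * (hi_b - hi_a)


def generate_series(condition, steps, seed=0):
    """Generate one synthetic vitals time series for a condition."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    stages = REFERENCE[condition]
    series = []
    for step in range(steps):
        progress = step / (steps - 1) if steps > 1 else 0.0
        reading = {}
        for vital in stages[0]:
            low, high = interpolated_range(stages, vital, progress)
            value = rng.uniform(low, high)
            # Clamp to the hard limits so no reading is physiologically impossible.
            hard_low, hard_high = HARD_LIMITS[vital]
            reading[vital] = round(min(max(value, hard_low), hard_high), 1)
        series.append(reading)
    return series


if __name__ == "__main__":
    for reading in generate_series("sepsis", steps=5):
        print(reading)
```

In a fuller pipeline, a generator like Synthea or SDV/CTGAN would supply the patient records and tabular distributions, and a rule layer of this kind would run afterward to reject or correct clinically implausible values.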