Komodo Health is dedicated to reducing the global burden of disease through smarter use of data. The Senior Data Engineer will play a critical role in shaping core data products that power the Healthcare Map, focusing on transforming complex healthcare datasets into reliable data assets and building scalable data pipelines.
Responsibilities:
- Build, operate, and optimize large-scale production data pipelines using Python, SQL, Airflow, cloud infrastructure, and distributed processing frameworks
- Transform massive healthcare claims, EHR, and reference datasets into trusted, performant Healthcare Map data products and serving-ready data assets
- Strengthen pipeline reliability through data quality checks, validation, lineage, observability, monitoring, and alerting
- Debug complex data, system, and performance issues across computationally intensive workflows
- Partner with Data Product Quality, Product, Platform, and Engineering teams to translate healthcare data needs into scalable technical solutions
- Contribute to system design, architecture, code quality, testing, documentation, CI/CD, and rotational production support
- Enable downstream analytics, product, and AI/ML use cases through high-quality, well-modeled, reliable data
Requirements:
- Healthcare data experience across claims, clinical, real-world evidence (RWE), provider, patient, or life sciences datasets, including coding systems and identifiers such as ICD-10, CPT, NDC, or NPI
- Strong hands-on experience building, operating, and debugging production-grade data pipelines at scale
- Advanced Python and SQL skills, with experience in Airflow or similar workflow orchestration tools
- Experience with Spark or comparable distributed data processing frameworks
- Proven experience designing and operating data solutions in AWS
- Strong instincts for data quality, reliability, root-cause analysis, and production troubleshooting
- Ability to communicate technical trade-offs clearly and collaborate with engineering, product, and data partners
- Comfort using AI-assisted engineering tools for productivity, debugging, documentation, and technical exploration
- Experience delivering external-facing data products to customers through APIs, serving layers, or production access patterns
- Ability to optimize high-scale data architectures for performance, cost, versioning, and large-volume productization
- Experience applying AI or agentic workflows to engineering, data quality, delivery, or operations
- Success in high-growth or ambiguous environments that require balancing architecture, speed, and quality