Sprinter Health is reimagining how people access care by bringing it directly to their homes. The company is seeking a Staff Site Reliability Engineer to build the reliability, infrastructure, and security foundations that power last-mile healthcare delivery at scale, with a focus on operational efficiency and system resilience.
Responsibilities:
- Design, build, and improve the infrastructure that powers Sprinter’s patient care, clinician operations, internal tooling, and partner-facing systems
- Improve reliability across distributed systems, cloud infrastructure, CI/CD, observability, and incident response
- Raise the security baseline across cloud infrastructure, access controls, secrets management, identity, and operational workflows
- Build and maintain infrastructure as code using Terraform and related tooling
- Automate manual infrastructure and operational processes through scripting, tooling, and platform improvements
- Partner with engineering teams to improve system architecture, deployment practices, monitoring, logging, and alerting
- Troubleshoot complex issues across infrastructure, application, data, and operational boundaries
- Help define reliability, security, and infrastructure standards that allow Sprinter to scale without creating brittle systems
- Support incident response practices, postmortems, operational readiness, and continuous improvement across engineering
- Make pragmatic tradeoffs between reliability, security, speed, and simplicity in a fast-moving startup environment
Requirements:
- Spent 8+ years in site reliability engineering, platform engineering, infrastructure engineering, security engineering, or related technical roles
- Led high-impact infrastructure, reliability, platform, or security projects end to end with minimal oversight
- Built and operated production systems in cloud environments, ideally AWS and/or GCP
- Worked deeply with infrastructure as code, ideally Terraform
- Improved observability, monitoring, logging, alerting, and incident response practices across engineering teams
- Automated infrastructure, deployment, or operational workflows using scripting languages such as Python, Bash, or TypeScript
- Improved cloud security, access management, secrets management, networking, or operational controls
- Troubleshot production issues across application, infrastructure, networking, and deployment layers
- Worked in ambiguous environments where reliability, security, and speed all matter
- Made technical decisions that balanced immediate business needs with long-term scalability, reliability, and maintainability
- You've built or scaled infrastructure in health tech, logistics, marketplace, fintech, or other operationally complex environments
- You've worked in mid- or growth-stage startups where speed, ambiguity, and pragmatic decision-making were required
- You have experience improving security posture in a practical, engineering-friendly way
- You've helped establish reliability standards, incident response practices, or platform patterns across an engineering org
- You're comfortable working directly with product engineers, data teams, operations, security stakeholders, and technical leadership
- You have experience mentoring engineers and raising the operational bar across a broader engineering team
- You've worked in regulated environments and understand the importance of privacy, security, and compliance best practices
- You have people management experience or interest in growing into broader technical leadership over time