Netflix is a company dedicated to entertaining the world through innovative storytelling and technology. It is seeking a Senior Distributed Systems Engineer to design, build, and operate the analysis tooling that supports A/B test analysis and ensures the reliability and performance of experimentation workflows.
Responsibilities:
- Build and evolve critical experiment analysis tooling
  - Build, maintain, and improve real-time and batch analysis workflows for experiment analysis, regression detection, and more
- Own reliability and performance
  - Participate in on-call, lead incident response, and drive long-term reliability improvements
  - Instrument services with rich observability (metrics, logs, traces) and continuously tune for resilience, performance, and scalability
- Shape data and integration surfaces
  - Collaborate with teams using technologies like Flink, Spark, Elasticsearch, and Druid to ensure experimentation data is correct, timely, and usable
  - Define clear data and API contracts for consumers and pipelines
- Partner with product engineering teams
  - Deeply understand experimentation workflows across Netflix
  - Simplify and improve the experience of monitoring experiments and making decisions on them
Requirements:
- Strong understanding of the experimentation lifecycle and its risks
- Backend & distributed systems depth
- Strong coding & operational rigor
- High-impact collaboration & product sense
- A background in data science/statistics is a big plus