Netflix is a leading entertainment company focused on innovation and storytelling. It is seeking a Software Engineer 5 for the Agent Platform team to design and build AI agent infrastructure and ensure its reliability and observability at scale.
Responsibilities:
- Design, build, and operate the Agent SDK and MCP Gateway that Netflix engineers use to build, deploy, and run AI agents in production
- Build agents and agent infrastructure across the full lifecycle — plan/act/observe loops, tool and MCP integrations, deployment, and day-2 operations
- Make evaluation a first-class part of the platform: build the tracing, eval suites, and quality signals that let teams measure agents, catch regressions, and iterate to improve them
- Own reliability, observability, and guardrails for non-deterministic systems running at very high scale
- Lead cross-functional initiatives with ML scientists, data scientists, product managers, and other AI Platform teams
- Rapidly iterate with users to improve the developer experience while establishing durable foundational capabilities
Requirements:
- 8+ years of software engineering experience with a track record of delivering quality results
- Hands-on experience building, deploying, operating, AND evaluating LLM agents in production — not just chat-completion apps or prototypes
- Experience with one or more agent frameworks/SDKs (Strands, OpenAI Agents SDK, Anthropic Claude Agent SDK, LangGraph, pydantic-ai, CrewAI, Google ADK) and with tool/function calling and MCP
- Experience with LLM/agent evaluation and observability — building eval suites, tracing, and quality measurement, then iterating on results (Braintrust, LangSmith, W&B, or equivalent)
- Strong experience building SDKs and APIs for internal or external developers
- Strong fundamentals in building and operating scalable, observable, fault-tolerant distributed systems
- Proficiency in Python (and Python packaging tooling) plus one of Java, Go, C/C++, Rust, or Zig
- Experience with large-scale build, release, CI/CD, and observability practices
- Familiarity with our stack — Temporal, FastAPI, PostgreSQL, Kubernetes — is a plus