Support and operate Legion’s AWS-based cloud platform and Kubernetes (EKS) environments.
Leverage GenAI tools (e.g., Claude Code, Codex, or similar) to accelerate infrastructure development, automation, and auto-remediation of common production issues.
Build and maintain infrastructure-as-code using Terraform.
Develop automation and internal tooling using Go or Python.
Improve CI/CD pipelines to increase deployment safety and velocity.
Define and improve monitoring, alerting, and observability systems.
Respond to production incidents, conduct root cause analysis, and implement systemic improvements.
Develop and automate operational runbooks and remediation workflows.
Support production deployments, including during off-hours as needed.
Requirements
8+ years of experience in SRE, DevOps, or SaaS production operations.
5+ years of hands-on experience operating large scale production workloads in AWS.
Strong experience with Terraform and infrastructure-as-code practices.
5+ years of experience with containerized environments using Docker and Kubernetes (EKS preferred); familiarity with Helm.
Proficiency in Go or Python (or similar programming language).
Experience building and maintaining CI/CD systems (Git-based workflows, Argo, Jenkins or similar).
Strong Linux/Unix systems experience.
Bachelor’s degree in Computer Science or equivalent practical experience.