Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. As a Senior DevOps Engineer, you will improve the reliability, security, and efficiency of the Optum Consumer Payment Network by leveraging modern cloud technologies and advancing DevOps culture across engineering teams.
Responsibilities:
- Enable teams to define, measure, and meet reliability goals (SLIs/SLOs) by strengthening post-incident learning, reducing alert noise, and helping teams create and maintain quality runbooks
- Build and enhance shared observability capabilities (metrics, monitoring, logging, dashboards, and alerting) to support >99.95% availability for business-critical applications
- Partner with software engineers across the organization to provide hands-on guidance by establishing patterns for engineering excellence initiatives (CI/CD, infrastructure as code, zero-downtime deployments, automated remediation)
- Use AI-assisted tooling to improve engineering productivity (e.g., deployment reliability, incident analysis, automation, and documentation)
- Provide 24×7 production support via a rotating on-call schedule
- You will be recognized for your performance and supported with development opportunities
- Design, develop, and deploy AI-powered solutions to address complex business challenges with emphasis on responsible use of AI
Requirements:
- Bachelor's degree in Computer Science, Software Engineering, or an IT-related field
- 5+ years of experience with DevOps, security best practices, CI/CD, infrastructure as code (IaC), and observability (e.g., GitHub, Datadog, New Relic or Dynatrace, Terraform, PagerDuty)
- 3+ years of experience operating enterprise-scale production applications in hybrid environments (on-premises and public cloud), including Kubernetes-based workloads
- 2+ years of experience with a programming or scripting language for automation/tooling (e.g., .NET/C#, Java, Python, Go)
- 1+ years of experience with AIOps or AI-powered coding and analysis tools for faster root cause analysis (RCA), alert noise reduction, and anomaly detection
- Working knowledge of cloud networking, cloud security, containerization, disaster recovery, centralized logging, and monitoring
- Experience with cloud security controls such as DDoS protection, vulnerability management, and patching
- Experience with payment industry standards, protocols, and security best practices
- Strong foundation in Linux and/or Windows operating systems and troubleshooting tools