Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The Lead Software/DevOps Engineer will collaborate with software developers and operations teams, monitor software performance, and mentor junior engineers while ensuring system readiness and best practices for cloud services.
Responsibilities:
- Collaborating with software developers, engineers, and operations teams
- Monitoring sites and software to make sure they're performing properly (including on-call shifts)
- Anticipating potential problems before they occur (and coming up with solutions)
- Conducting post-incident reviews
- Documenting your work to turn findings into repeatable actions
- Working as a hands-on developer
- Mentoring and coaching junior engineers
- Conducting regular system audits and capacity planning exercises to identify areas for improvement and ensure readiness for future growth
- Participating in on-call rotations and responding to incidents in a timely manner, ensuring quick resolution and effective communication with stakeholders
- Establishing and maintaining best practices for monitoring, logging, and alerting using tools like Datadog, Prometheus, and Grafana
- Configuring and maintaining services such as load balancers, relational & NoSQL databases, and messaging systems while ensuring high availability and performance
- Designing, developing, and deploying AI-powered solutions to address complex business challenges with emphasis on responsible use of AI
Requirements:
- Bachelor's degree in computer science, information technology, or a related engineering field
- 10+ years of experience in object-oriented programming with Java
- 5+ years of experience as a Lead Software Engineer, DevOps Engineer or in IT Operations
- 3+ years of experience with at least one public cloud platform such as AWS, Azure, or GCP
- 2+ years of experience with container technologies like Docker and Kubernetes
- 1+ years of experience with automation and scripting languages such as Python, Bash, PowerShell, and Perl
- Excellent communication and interpersonal skills, with the ability to work collaboratively with development teams, stakeholders, and management
- Strong problem-solving skills on complex technical issues and a proactive attitude toward identifying and addressing potential issues
- Experience with public cloud platforms, hybrid cloud environments, and migration strategies
- Experience with REST API design, microservices, and event-driven architecture
- Experience with configuration and deployment management tools such as Ansible and Terraform
- Experience with configuration and maintenance of services such as load balancers, relational & NoSQL databases, and messaging systems
- Experience with monitoring and alerting tools such as Datadog, Prometheus, and Grafana
- Experience with incident response and post-mortem analysis