Design, build, and operate cloud infrastructure for critical production and non-production applications with reliability and simplicity as primary goals
Architect and evolve multi-account AWS foundations (organizations, accounts, IAM boundaries, guardrails, and environment separation) to enable secure, scalable delivery
Design and operate cloud networking architecture (VPCs, routing, segmentation, ingress/egress, connectivity patterns) to support reliability, security, and compliance requirements
Treat reliability, security, and compliance as first-class design concerns throughout the system lifecycle
Build tooling and automation that reduces errors, shortens recovery time, and improves day-to-day operations
Implement monitoring, logging, and alerting that make system behavior observable and actionable
Use AI-assisted tools to accelerate infrastructure delivery, automation, troubleshooting, and root-cause analysis, applying engineering judgment to validate outcomes
Implement reliability guardrails for releases (progressive delivery, safe rollbacks, change risk controls) and provide production support during deployments
Participate in incident response, perform root cause analysis, and drive durable improvements that prevent recurrence
Work closely with application engineers to co-own system design, operation, and continuous improvement
Maintain clear, lightweight documentation that supports shared ownership and effective on-call operations
Requirements
BA/BS in a related technical field, or equivalent education and work experience
8+ years of experience in DevOps, SRE, platform engineering, or similar roles supporting application teams running production services
Strong CI/CD experience (Jenkins and Git-based workflows preferred), including building secure, reliable pipelines and enabling teams to ship safely
Experience implementing and operating observability platforms (logging/metrics/alerting); Elastic Stack/OpenSearch experience is a plus
Hands-on, demonstrable experience designing and operating AWS environments, and enabling application teams to adopt AWS correctly (networking, IAM, security, reliability, and cost awareness)
Infrastructure as Code experience (Terraform preferred; CloudFormation acceptable), including building reusable modules/patterns and managing changes through review and automation
Experience supporting CI/CD builds and deployment patterns for common application stacks (for example, Java, Node.js, and .NET)
Experience scripting in Bash, Python, or PowerShell
Experience working on large-scale, cloud-based web applications
Tech Stack
AWS
Cloud
Java
Jenkins
Node.js
Python
Terraform
.NET
Benefits
Comprehensive care for your physical and mental well-being