About this role
Role Overview
Own and develop ZayZoon's infrastructure-as-code using CloudFormation, with an emphasis on serverless resources (ECS, Fargate, Lambda)
Instrument and analyze daily metrics across both infrastructure performance and our Ruby on Rails applications, using AWS tooling (Athena, CloudTrail, CloudWatch) and third-party observability platforms (Grafana, OTel)
Build, optimize, and maintain efficient pipelines (GitHub Actions, CodeDeploy, CodePipeline) to accelerate developer velocity, including modern deployment strategies like blue/green deployments and intelligent auto-scaling
Stay ahead of resource dependencies, particularly databases (RDS, Redshift), including upgrades, playbooks, and downtime planning
Work closely with application developers to serve all of their infrastructure needs, turning repeatable needs such as spinning up environments and running jobs into platform services that devs can use whenever they need them
Forecast costs and implement AWS cost-savings programs, including reserved instances
Partner with our risk and security teams to maintain SOC-2 and cybersecurity compliance, and actively evaluate and remediate Critical and High CVEs across all services
Collaborate extensively with app developers on shared metrics, database performance, and load testing
Collaborate extensively with data engineers to facilitate data warehouse development, ELT, and ETL
Participate in our agile process: sprint planning, story grooming, and standup
Champion our SDLC and secure coding practices across everything you ship
Requirements
5+ years of cloud infrastructure engineering experience, with deep, production-level AWS expertise
2+ years of AWS experience including certification and deployment of production applications
Strong proficiency with IaC, specifically CloudFormation
Hands-on experience with containerization (Docker, ECS, ECR)
Experience with Python for dev tooling and scripting
Experience analyzing and acting on performance issues using observability platforms like OTel, and building dashboards with Grafana
A bias toward building POCs quickly, figuring out what works, and then levelling up to quality, scalable features when the MVP becomes core functionality
A deep desire to track and trace everything so we can find root causes before incidents happen, and the sleeves-rolled-up attitude to help the team dig in when something does go wrong
Tech Stack
Amazon Redshift
AWS
Cloud
Cyber Security
Docker
ETL
Grafana
Python
Ruby
Ruby on Rails
SDLC
Benefits
Permanently Remote: Work from a desk, a coffee shop, or in the great outdoors. Our jobs are fully remote, forever
Flexible Time Off: Whether it's a longer vacation to explore new horizons, a series of short breaks for regular rejuvenation, or stepping away for a new level of mastery in a skillset, our “You-do-You” time off program caters to the diverse and evolving lifestyles of our team, with a maximum of 6 weeks of vacation
Instant Benefits: All full-time employees get access to medical, vision, and dental benefits from their very first day, including increased mental health coverage and a wellness stipend
Plus: Inclusive parental leave top-up, earned wage access, real-time market data for salaries, a supportive culture for lifelong learners, and more