You will be responsible for
DevOps & Platform Engineering
Own the design and evolution of CI/CD pipeline architecture, governance, and standards
Modernize and automate deployment pipelines for Kotlin-based AWS Lambda services using GitHub Actions
Standardize infrastructure and deployment processes across services
Reduce manual deployment effort through automation
Lead implementation of disaster recovery, high availability, and fault-tolerant designs
Automate infrastructure provisioning and lifecycle management to reduce manual work
Build and maintain end-to-end observability (metrics, logging, tracing, alerting)
Establish effective alerting that reduces noise and ensures high-signal incident detection
Proactively identify and address system risks before they impact customers
Lead incident response as part of a shared on-call rotation (triage, mitigation, communication)
Drive root cause analysis and blameless postmortems to prevent recurrence
Own and govern the release process, including deployment gates and approvals
Review and approve deployment plans to ensure quality and stability
Optimize the build and release lifecycle for speed, consistency, and reliability
Manage cross-repo dependencies and versioning strategies
Lead remediation of security vulnerabilities, collaborating with the Security team as needed
Establish processes to proactively prevent new security risks
Embed secure development and deployment practices into pipelines
Guide the development team toward reliability and security best practices
Proactively identify issues, drive visibility, and ensure timely resolution with engineers
Stay up to date with industry trends and emerging technologies to drive innovation
Communicate technical concepts clearly to both technical and non-technical stakeholders
Requirements
7+ years of experience in DevOps, SRE, Platform Engineering, or similar roles focused on CI/CD, cloud infrastructure, and system reliability
Strong experience with:
AWS Serverless architectures
Terraform and CloudFormation
CI/CD pipelines (GitHub Actions preferred)
Azure Entra ID, OAuth2, OpenID Connect
Maven build tooling
Git-based version control workflows (GitHub preferred)
Proven ability to:
Design and optimize deployment pipelines
Troubleshoot complex distributed systems
Make data-driven decisions
Translate business requirements into scalable technical solutions
Strong communication, collaboration, and organizational skills
Understanding of system design patterns for reliability and scalability
Nice to have
Experience with Kotlin or Java development
Experience with NoSQL databases (for example DynamoDB) and relational databases (for example PostgreSQL)
Experience working in Agile or Scrum environments
Familiarity with artifact management tools such as JFrog Artifactory
Experience defining and managing SLAs, SLOs, and SLIs
Experience with distributed tracing tools such as AWS X-Ray or OpenTelemetry
Experience using AI tools or AI agents to improve development, automation, or operational workflows
Tech Stack
AWS
Azure
Cloud
Distributed Systems
DynamoDB
Java
Kotlin
Maven
NoSQL
Postgres
AWS X-Ray
Terraform
Benefits
Health, dental, and vision benefits from day one
401(k) with 5% matching
Additional benefits including but not limited to financial support, pet insurance, mental health resources, volunteer paid days off, employee stock program, foundation donation matching, and much more!