Design, implement, and maintain secure, highly available, and cost-efficient container orchestration platforms, including Kubernetes and ECS.
Develop and optimize Continuous Integration and Continuous Delivery (CI/CD) pipelines to streamline and enhance deployment processes, enabling high efficiency for Product Engineering teams.
Build and refine tools and patterns for monitoring and observability to strengthen failure detection, response, and recovery capabilities.
Collaborate with Technology Engineering, Development, and Product Management teams to scale, improve, and support production systems and services.
Partner with service teams to provide comprehensive documentation, knowledge sharing, architecture planning, capacity assessments, and recommendations for future optimizations.
Engineer solutions aimed at preventing failures and minimizing the likelihood of system issues.
Write clean, maintainable code, develop thorough test plans, and assess code quality while providing constructive feedback during code reviews.
Design and operationalize AI agent infrastructure: Deploy and manage orchestration frameworks (e.g., LangChain, AutoGen) on enterprise platform infrastructure to automate complex, multi-step workflows—including self-healing pipelines, infrastructure drift detection, and automated on-call triage running on Kubernetes.
Integrate AI coding assistants into platform engineering: Evaluate, configure, and govern AI-powered development tools (e.g., GitHub Copilot, Claude) to ensure secure, seamless integration within CI/CD pipelines, code review processes, and Infrastructure-as-Code (IaC) toolchains like Terraform and CloudFormation.
Requirements
Minimum of 2 years of experience in a Platform Engineer or similar role
Proficiency in Python and/or Golang with a strong software engineering mindset
Experience managing and administering Linux systems
Experience with Docker and Kubernetes
Experience with AWS (EC2, RDS, DynamoDB, Route 53, Elastic Load Balancers, AMIs, IAM Roles, OpsWorks, and CloudFormation)
Knowledge of, or experience building, CI/CD pipelines
Experience with infrastructure as code concepts such as immutable and scalable infrastructure
Solid understanding of networking systems as well as identity and authorization mechanisms
Experience using AI coding tools (GitHub Copilot, Claude) to support Platform Engineering workflows
Understanding of agentic AI and Model Context Protocol (MCP) servers
Tech Stack
AWS
Cloud
Docker
DynamoDB
EC2
Kubernetes
Linux
Python
Terraform
Go
Benefits
Work in a collaborative environment with strong ambitions and goals