Close is a bootstrapped, profitable company building a communication-focused CRM for small, scaling businesses. The Site Reliability Engineer will join the Infrastructure Team to build and maintain the platform that runs all Close systems, ensuring stability and efficiency in a fully automated environment.
Responsibilities:
- Fully automating our database lifecycles with Argo Workflows
- Eliminating all static credentials, wherever they exist
- Reducing downtime and disruption due to maintenance or disaster to new lows
- Improving our multi-region disaster recovery system
Requirements:
- Senior 1 and Senior 2 level candidates: 5+ years of experience building modern infrastructure systems
- Staff level candidates: 8+ years of experience building modern infrastructure systems
- You are respected as an expert on the systems you run
- You have been the final point of escalation in the support of mission critical production systems
- Familiarity with some of the following technologies: AWS, Terraform, Kubernetes, Ansible, MongoDB, PostgreSQL, Elasticsearch
- Strong grasp of common networking and data transfer protocols such as DNS, HTTP, TCP
- Able to speak and write in English
- Located in the USA (ET, CT, MT, PT)
- Contributed open source code related to our tech stack
- Experience maintaining very large databases
- Experience with a successful disaster response
- Experience with multi-region architectures
- Experience running MLOps systems
- Experience scaling Temporal