Tempus AI is focused on advancing the healthcare industry through precision medicine and AI technology. They are seeking a Site Reliability Engineer to manage cloud infrastructure, automate tasks, and support teams in delivering high-quality software while ensuring system resilience and scalability.
Responsibilities:
- Work as a member of an SRE team
- Review application requirements and recommend solutions and common design patterns
- Use your experience, deep technical knowledge, and creativity to simplify development and infrastructure provisioning workflows
- Configure and deploy cloud infrastructure using Terraform and CI tools like Concourse
- Proactively and continuously learn about new and relevant technologies
- Use your knowledge to influence other developers and advocate for best practices
- Implement dashboards, monitoring, and alerting for team services
- Support your users either in person or via Slack
Requirements:
- You have experience managing cloud infrastructure in AWS, GCP, or Azure
- You have built and deployed containerized applications and services
- You enjoy automating manual tasks
- You have worked in agile environments and are comfortable iterating quickly
- You enjoy collaborating and communicating with team members of varying skill sets and from different fields
- Experience with at least one infrastructure-as-code, configuration management, or container tool such as Terraform (our favorite), Kubernetes, CloudFormation, Docker, Ansible, Salt, Packer, Puppet, Chef, or similar
- Familiarity with database design and tuning, especially with AWS Aurora MySQL and PostgreSQL
- Familiarity with data workflow orchestration tools such as Composer and Dataproc
- Proficiency with Bash and a programming language (Python, Ruby, Go, etc.)
- Previous experience in the healthcare sector and in securing infrastructure to meet standards or compliance frameworks such as HIPAA, HITRUST, or ISO