Tempus AI is focused on advancing the healthcare industry through precision medicine and AI technology. They are seeking a Site Reliability Engineer to manage cloud infrastructure, automate tasks, and support teams in delivering high-quality software while ensuring system resilience and scalability.

Responsibilities:

Work as a member of an SRE team
Review application requirements and recommend solutions and common design patterns
Use your experience, deep technical knowledge, and creativity to simplify development and infrastructure provisioning workflows
Configure and deploy cloud infrastructure using Terraform and CI tools like Concourse
Proactively and continuously learn about new and relevant technologies
Use your knowledge to influence other developers and advocate for best practices
Implement dashboards, monitoring, and alerting for team services
Support your users either in person or via Slack

Requirements:

You have experience managing cloud infrastructure in AWS, GCP, or Azure
You have built and deployed containerized applications and services
You enjoy automating manual tasks
You have worked in agile environments and are comfortable iterating quickly
You enjoy collaborating and communicating with team members of varying skill sets and from different fields
Experience with one of the many infrastructure-as-code tools such as Terraform (our favorite), Kubernetes, CloudFormation, Docker, Ansible, Salt, Packer, Puppet, Chef, or similar
Familiarity with database design and tuning, especially with AWS Aurora MySQL and PostgreSQL
Familiarity with data workflow orchestration tools such as Composer and Dataproc
Proficiency with Bash and at least one programming language (Python, Ruby, Go, etc.)
Previous experience in the healthcare sector and in securing infrastructure to meet standards or compliance frameworks such as HIPAA, HITRUST, or ISO
