Customer Site Reliability Engineer – OpenShift Managed Cloud Services, Kubernetes, AWS, Azure, Linux at Red Hat | JobVerse
Red Hat
Remote
India
Full Time
1 hour ago
Visa Sponsorship
Key skills
Ansible
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Kubernetes
Linux
OpenShift
Prometheus
TCP/IP
Terraform
Go
Golang
GCP
Google Cloud
Leadership
Collaboration
About this role
Role Overview
Manage large-scale, distributed systems, focusing on minimizing downtime and improving system resilience.
Maintain customer trust and confidence by ensuring stability and functionality of services.
Drive continuous enhancement of processes, tools, and methodologies to support the evolving needs of the service.
Lead the development of code and automation scripts to optimize the scalability, reliability, and performance of services.
Lead and participate in high-priority customer escalations, adopting a customer-first mindset.
Coordinate and execute complex incident response procedures, ensuring timely resolution and thorough postmortems.
Collaborate with cross-functional teams to enhance system robustness.
Document resolutions, root causes, and best practices to enrich the knowledge base and promote self-service solutions.
Mentor and coach team members, fostering a culture of continuous learning, knowledge sharing, and collaboration.
Participate in the on-call rotation and provide leadership during critical incidents.
Requirements
Advanced experience with OpenShift/Kubernetes container platform support or administration.
Proficient with container-based technologies on Linux.
Proficient in managing Linux-based systems in a public cloud such as AWS, Azure, or GCP.
Advanced experience with enterprise systems monitoring; knowledge of Prometheus is preferred.
Advanced experience with enterprise configuration management tools such as Ansible and Terraform.
Software engineering experience using object-oriented languages; Go is preferred.
Demonstrated ability to quickly and accurately troubleshoot systems issues.
Solid understanding of standard TCP/IP networking and common protocols.
Fluent in English; an additional language such as Japanese, Chinese, Korean, or Spanish is an advantage.
Tech Stack
Ansible
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Kubernetes
Linux
OpenShift
Prometheus
TCP/IP
Terraform
Go
Benefits
Flexible work arrangements
Professional development opportunities