Lead and grow a small DevOps engineering team supporting Unite Services and Unite Integrations.
Drive team planning and execution across operational efficiency, security, reliability, and platform maturity epics (e.g., quarterly OKRs/initiatives).
Provide technical mentorship on Kubernetes, GitHub, CI/CD, and cloud infrastructure.
Collaborate closely with Engineering, SRE, Security, and Ops on roadmap, incident resolution, and cross-team initiatives.
Own the operational health of Unite Services and Unite Integrations: lead production deployments and release processes for backend services (e.g., Unite Notifications, integrations platform, Salesforce-related services).
Oversee multi-environment support (DEV/QA/PROD and special environments like RTE).
Operate and evolve Kubernetes clusters (cloud and on‑prem), including cluster upgrades and migration playbooks; ingress/NGINX and service mesh/networking changes; and resource/capacity optimization (CPU throttling, scaling, etc.).
Manage supporting infrastructure for Unite platforms: Redis, Consul, Harbor, ChartMuseum, API gateways, message queues (e.g., RabbitMQ).
Lead the migration and consolidation of repos and pipelines to GitHub (from TFS/Azure DevOps and other systems).
Requirements
Strong background in platform / DevOps engineering for complex distributed systems.
Deep understanding of infrastructure components: compute, networking, storage, DNS, certificates, load balancing, and how they interact.
Deep knowledge of Kubernetes (cluster operations, upgrades, scaling, security, ingress, RBAC).
Experience with Docker/containerization in production.
Solid experience with at least one major cloud provider (Azure preferred; AWS or GCP a plus).
Comfort working in hybrid environments (cloud + on‑prem).
Strong experience designing and running CI/CD pipelines.
Hands-on experience with GitHub Actions and self-hosted runners (including scaling and security).
Proficiency with IaC tools (Terraform, Crossplane, Helm, etc.).
Experience with tools like Ansible, Chef, or similar for configuration management and automation.
Experience with Prometheus, Grafana, and centralized logging stacks (e.g., ELK, Splunk, or similar).
Understanding of security best practices across network, infrastructure, and application layers.
Proven experience leading DevOps / platform teams or being a strong technical lead.
Ability to mentor engineers and raise the technical bar for the team.
Experience leading small/medium-sized projects using Agile/Scrum or similar methodologies.
Strong communication skills, with the ability to translate technical constraints into clear options and decisions for stakeholders.
Demonstrated ability to own outcomes from design through implementation, rollout, and ongoing operations.
Strong troubleshooting skills and the ability to stay calm under pressure during incidents.
Tech Stack
Ansible
AWS
Azure
Chef
Cloud
Consul
Distributed Systems
DNS
Docker
Google Cloud Platform
Grafana
Kubernetes
NGINX
Prometheus
Python
RabbitMQ
Redis
Splunk
Terraform
TFS
Go
Benefits
We hire, promote, and compensate employees based on their ability to perform their job responsibilities, without regard to race, color, creed, religion, sex, gender, marital status, national origin, ancestry, age, citizenship, physical or mental disability, sexual orientation, or any other basis protected by applicable law (collectively referred to in our Code of Conduct as “Protected Classes”). We do not tolerate employment discrimination in the workplace, and we are committed to making reasonable accommodations for identified disabilities or other limitations as required by all applicable laws. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.