Triple Whale is a complete intelligence platform for ecommerce, helping brands make data-driven decisions to drive growth and efficiency. They are seeking a Senior Cloud Backend Engineer to join their Infrastructure Team, focusing on building reliable and scalable systems, supporting service infrastructure, and participating in on-call rotations for platform reliability.

Responsibilities:

Deploy and support our service infrastructure in Kubernetes
Identify the right tools and technologies for major initiatives and then build them
Help other teams and developers design robust and scalable systems
Scale and optimize multiple databases
Build internal tooling that accelerates developer velocity
Provide observability, monitoring, and visibility across our systems
Participate in a shared on-call rotation, typically 2–3 times per month, covering the period from Friday at 7:00 AM ET through Saturday at 5:00 PM ET
Serve as the primary point of escalation for production issues during your rotation
On-call is not a daily responsibility; it applies only during your assigned weekends

Requirements:

You are located in the New York tri-state area
3+ years of experience as an independent backend or infrastructure engineer
Ability to design and build scalable, reliable systems
Strong communication skills
Hands-on builder mentality — this is a coding role
Experience with relational and non-relational databases
Experience with major cloud platforms (GCP, AWS, Azure); GCP is an advantage
Experience with streaming systems
Experience with scaling large systems
Experience with message queues
Experience with monitoring systems like DataDog, Grafana, Groundcover
Experience with CI/CD, Git
Strong understanding of system architecture and cross-service dependencies
Previous real-world experience in production on-call environments
Ability to quickly assess incidents, identify scope/root cause, and understand platform impact
Ability to classify severity and prioritize response appropriately
Capability to deploy safe production hotfixes when needed
Solid judgment under pressure, especially when operating independently
Ownership mindset: from detection to mitigation to resolution
Production experience with Kubernetes and Knative
Experience with ClickHouse
