Own the technical vision and roadmap for GitLab’s core Workflow and Function Runtime primitives, in partnership with Product, Architecture, and other platform teams.
Lead and grow multiple engineering teams responsible for highly available, horizontally scalable, and secure distributed systems used across GitLab.
Be hands-on with architecture and code: review designs, dive into incidents, and contribute to critical paths where needed.
Drive operational excellence for these shared services, including SLOs, on-call, incident response, capacity planning, and resilience across regions and tenants.
Optimize for cost and efficiency at scale, balancing performance, reliability, and unit economics for long-running and bursty workloads.
Define and mature platform APIs and abstractions that allow product teams to compose workflows, schedule functions, and integrate with the runtime safely and predictably.
Create a strong engineering culture focused on results, iteration, ownership, and rigorous technical judgment.
Collaborate across the company (Product Management, Security, Infrastructure, Data, AI/ML, and other stage groups) to ensure the platform primitives meet the needs of diverse workloads.
Recruit, develop, and retain senior and staff-level engineers and managers; provide clear expectations, feedback, and growth paths.
Champion GitLab values (results, transparency, efficiency, collaboration, diversity & inclusion) in how the team plans, executes, and communicates.
Requirements
Proven experience leading teams building core platform or infrastructure services (e.g., workflow engines, function runtimes, control planes, high-scale microservices, or similar distributed systems).
Track record as a hands-on engineering leader: you’ve meaningfully contributed to architecture and design, and you are comfortable reading and writing code to unblock a team or clarify a design.
Strong background in scalable, multi-tenant distributed systems, including topics such as service decomposition, data partitioning, fault tolerance, and backpressure.
Demonstrated success operating mission-critical services in production (SLOs, on-call, incident management, postmortems, capacity management, chaos/DR testing).
Experience driving cost efficiency in cloud-native environments (profiling, performance tuning, right-sizing, storage and network optimization, and thoughtful use of managed services).
Familiarity with Kubernetes, modern cloud infrastructure (AWS/GCP/Azure), and event-driven/async architectures; experience with function-as-a-service or serverless runtimes is a plus.
People leadership experience managing managers and senior/staff engineers, with a record of hiring, coaching, and building inclusive, high-trust teams.
Ability to work effectively in a fully remote, globally distributed organization, with excellent written and asynchronous communication skills.
10+ years of professional software engineering experience, including at least 4 years in engineering leadership (multi-team or org-level scope).
Tech Stack
AWS
Azure
Cloud
Distributed Systems
Google Cloud Platform
Kubernetes
Microservices
Benefits
Benefits to support your health, finances, and well-being
Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan