vCluster Labs is a venture-backed tech startup pioneering Kubernetes virtualization for the AI era. As a Senior Software Engineer (vMetal), you will build systems that manage physical hardware, enabling customers to create tenant clusters. Your work will involve making architecture decisions, developing Go services, and ensuring customer-driven reliability.
Responsibilities:
- vMetal: You will contribute to architecture decisions, help drive the roadmap for bare metal provisioning and lifecycle management, and hold a high bar for code review and design quality on the team
- Bare Metal Programmability: Build the Go services that turn raw hardware into APIs our customers can consume. You will design and ship the systems that drive Redfish, IPMI, and PXE workflows in production: not glue scripts, but real services with clean interfaces and solid tests
- Hardware Lifecycle Automation: Own how servers get discovered, inventoried, provisioned, configured, and reclaimed. You will eliminate manual intervention from the day-2 path and design for hardware that fails in surprising ways
- Cross-Generational Architecture: Translate between traditional out-of-band server management and modern Kubernetes-native patterns. You will contribute to where the abstractions live and how vMetal exposes hardware to tenant clusters cleanly
- Customer-Driven Reliability: Partner with customer engineering and the broader platform team to debug, harden, and ship against real production workloads. You will be on-call for the systems you build and you will treat reliability as a first-class deliverable
Requirements:
- You write production Go for a living
- You can design clean services, APIs, and libraries, not just script around someone else's code
- You have shipped systems that drive servers via Redfish and IPMI in production
- You understand PXE boot end-to-end
- You have debugged what happens when a BMC lies to you
- You have built or operated bare-metal-as-a-service offerings, an AI Cloud, or a hyperscaler bare metal team where infrastructure was the product
- You hold both traditional server management and modern API design in your head and can design the bridge between them
- You are comfortable in IPMI, iDRAC, and iLO consoles, and equally comfortable shipping a Go controller
- You think about failure modes, telemetry, and recoverability before you ship
- You have been on-call for the systems you have built and you treat that as a feature, not a tax
- Comfort with Kubernetes internals, controllers, or operators, enough to design how bare metal hosts integrate cleanly with tenant clusters
- Production experience inside an AI Cloud, hyperscaler, or large platform team where bare metal scale was non-negotiable
- Meaningful contributions to projects in the bare metal, provisioning, or Kubernetes ecosystem, such as Tinkerbell, Metal3, Cluster API, or Ironic