RadixArk is an infrastructure-first company focused on democratizing frontier-level AI infrastructure. It is seeking a Member of Technical Staff — Inference to design and build large-scale inference systems for frontier AI models, optimize their performance in production, and work with kernel, compiler, and systems teams on performance-critical problems.

Responsibilities:

Design and build large-scale inference systems for frontier AI models
Optimize latency, throughput, and GPU utilization in production inference
Develop and improve model serving architectures and runtimes
Work on batching, scheduling, and memory management strategies
Collaborate with kernel, compiler, and systems teams on performance optimization
Debug performance bottlenecks across the stack
Drive reliability and scalability of inference infrastructure
Build tooling for observability, profiling, and performance analysis
Contribute to long-term inference architecture and strategy
