Lemurian Labs is reimagining the foundations of computing to make AI accessible to everyone, and they are seeking a Runtime Engineer to design and build the multi-target runtime for their AI compiler stack. This role involves working on low-level parallelization, kernel scheduling, and performance analysis to enhance the efficiency and scalability of their systems-level software.
Responsibilities:
- Design, develop, maintain, and improve our multi-target runtime
- Apply the latest techniques in parallelization and partitioning to automate kernel generation and exploit highly optimized execution paths
- Rapidly prototype new runtime ideas and use data to guide their exploration
- Benchmark and analyze the outputs produced by our optimizing compiler on target hardware
- Build tools to collect and analyze performance bottlenecks
- Work closely with our product team to understand the evolving needs of ML engineers and drive improvements in runtime architecture