NVIDIA is seeking a High-Performance LLM Training Engineer to improve the efficiency of LLM training workloads. The role involves optimizing NVIDIA’s high-performance LLM software stack and shaping hardware roadmaps for future GPUs.
Responsibilities:
- Understand, analyze, profile, and optimize AI training workloads on innovative hardware and software platforms
- Understand the big picture of training performance on GPUs, then prioritize and solve problems across state-of-the-art neural networks
- Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks
- Build and support NVIDIA submissions to the MLPerf Training benchmark suite
- Implement key DL training workloads in NVIDIA's proprietary processor and system simulators to enable future architecture studies
- Build tools to automate workload analysis, workload optimization, and other critical workflows