DataDirect Networks (DDN), a global leader in AI and multi-cloud data management at scale, is seeking a highly experienced Senior Staff Engineer specializing in AI Data Path & Storage. The role involves leading the development of advanced storage systems and their integration with AI inference pipelines, with a focus on high-performance data movement and system optimization.
Responsibilities:
- Lead the design and implementation of high-performance data movement pipelines using NVIDIA NIXL across GPU, CPU, and storage tiers
- Architect and drive integration of DDN Infinia with GPU-accelerated inference platforms for large-scale, real-time AI workloads
- Own end-to-end optimization of I/O paths between GPU memory and storage using technologies such as NVIDIA GPUDirect Storage, RDMA, and NVMe-over-Fabrics
- Define and implement multi-tier storage architectures (NVMe, SSD, object storage) optimized for inference latency, throughput, and scalability
- Lead development of advanced KV cache management strategies, including offloading, prefetching, and persistence across distributed storage layers
- Partner with AI/ML engineering teams to optimize inference performance in frameworks such as PyTorch and TensorFlow
- Establish benchmarking frameworks and lead performance tuning efforts for storage and data movement in production inference environments
- Diagnose and resolve complex system bottlenecks across storage, networking, and GPU subsystems
- Influence architecture decisions for distributed inference systems, ensuring scalability, resilience, and efficient data locality
- Drive engineering excellence through best practices in observability, performance monitoring, automation, and reliability engineering
- Mentor junior engineers and provide technical leadership across cross-functional teams
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 12+ years of experience in storage systems, distributed systems, or performance engineering
- Proven track record of architecting and delivering large-scale, high-performance infrastructure systems
- Deep expertise in distributed storage architectures (object storage, scalable file systems, or cloud-native storage platforms)
- Strong understanding of the Linux I/O stack, filesystem internals, and storage protocols
- Extensive hands-on experience with NVMe, SSD optimization, and high-performance storage environments
- Strong experience with RDMA, InfiniBand, or other high-speed data transfer technologies
- Solid understanding of GPU computing concepts and CPU–GPU data movement patterns
- Proficiency in Python and/or C/C++, with advanced debugging, profiling, and performance tuning skills
- Demonstrated ability to optimize latency-sensitive, high-throughput production systems
- Hands-on experience with NVIDIA NIXL or similar data movement frameworks
- Experience with GPU-aware storage pipelines and GPUDirect Storage
- Strong understanding of AI inference systems, LLM serving architectures, and KV cache optimization
- Experience with Retrieval-Augmented Generation (RAG) pipelines and open vector search ecosystems
- Background in high-performance computing (HPC) or hyperscale distributed environments
- Expertise in caching strategies, memory tiering, and data locality optimization
- Experience designing disaggregated compute and storage architectures