NVIDIA is a leading technology company known for its innovative approach to computing and AI. It is seeking a Senior Software Engineer to contribute to the NeMo Platform, with a focus on building and improving AI systems through effective evaluation and infrastructure development.
Responsibilities:
- Design and implement Python-first APIs, SDK workflows, and plugin interfaces for building, measuring, and improving agents across multiple runtimes and product surfaces
- Build reusable systems for observing behavior, measuring progress, detecting regressions, and turning runtime evidence into product decisions
- Build systems for ingesting, normalizing, validating, and analyzing agent execution data and evaluation datasets
- Partner with research, product, platform, and infrastructure teams to integrate agentic capabilities broadly across NVIDIA agent runtimes and developer workflows
- Help turn emerging agent development and improvement techniques into reliable, reusable product capabilities
- Improve reliability, observability, debuggability, and performance across NeMoStack services, SDKs, plugins, jobs, and developer workflows
- Build strong test coverage across unit, integration, E2E, Docker, and Kubernetes workflows
- Drive “speed of light” engineering: fast iteration, high ownership, pragmatic decisions, and performance-minded implementation under production constraints
- Provide senior technical leadership through design reviews, code reviews, mentoring, and ownership of ambiguous cross-component problems
Requirements:
- BS, MS, or equivalent experience in Computer Science, Computer Engineering, or a related technical field
- 5+ years of professional software engineering experience building production systems
- Excellent Python engineering skills, including API design, typing, testing, debugging, performance analysis, and maintainable software design
- Experience designing SDKs, libraries, plugins, CLIs, or other developer-facing interfaces
- Experience with distributed systems, cloud-native services, containers, Kubernetes, or job orchestration
- Strong understanding of reliability, scalability, security, and performance tradeoffs in production infrastructure
- Experience with structured data modeling and validation systems such as Pydantic, typed schemas, event/trace models, or SDK-generated types
- Ability to work independently, define technical scope, break down ambiguous problems, and drive work across team boundaries
- Clear communication skills and a track record of collaborating with engineering, product, research, or customer-facing teams
- Experience building, deploying, and iterating on production agentic AI systems where evaluation was used to measure and improve real product outcomes
- Experience designing evaluation workflows for heterogeneous agents, including tool-using agents, RAG agents, workflow agents, coding agents, or long-running autonomous systems
- Experience integrating evaluation capabilities across multiple products, runtimes, or internal platforms, especially through Python SDKs, plugins, or shared developer tooling
- Strong ability to connect technical evaluation work to business outcomes, product quality, user experience, reliability, or operational efficiency
- Experience with enterprise AI systems where measurement, regression testing, observability, governance, and continuous improvement are required for production deployment