Skild AI is building the world's first general-purpose robotic intelligence, aiming to deploy robots within society through data-driven machine learning. The Senior Software Engineer will be responsible for building and scaling training infrastructure and tools that support the full machine learning lifecycle for real-world robotics applications.
Responsibilities:
- Architecting, building, and maintaining distributed training pipelines and frameworks spanning data ingestion and preprocessing, large-scale training, and evaluation
- Optimizing training performance and resource utilization by identifying bottlenecks and implementing improvements in data loading, I/O, caching, sharding, and prefetching
- Integrating state-of-the-art ML techniques into production training systems in collaboration with research/ML teams
- Implementing monitoring, logging, alerting, automated testing, and CI/CD for reliable training operations
- Building developer tooling and documentation, including dashboards and utilities, to streamline experimentation at scale and improve engineer productivity