Stack AV is developing revolutionary AI and advanced autonomous systems to enhance safety and efficiency in the trucking industry. As a Labeling Infrastructure Backend Engineer, you will design and scale systems that manage massive datasets and integrate with machine learning pipelines.
Responsibilities:
- Architect Scalable Systems: Design and maintain robust, distributed backend services capable of managing millions of labeling tasks and high-throughput data streams with minimal latency
- Optimize Data Pipelines: Build and refine processes to ingest raw data from diverse sources and deliver high-quality labeled outputs to model training environments
- ML Integration: Implement "Active Learning" and "Model-in-the-loop" features, enabling automated pre-labeling and intelligent task prioritization to maximize human efficiency
- API Development: Develop and document clean, performant APIs that serve as the bridge between our front-end labeling tools, third-party vendors, and internal ML platforms
- Infrastructure & DevOps: Manage cloud-native infrastructure to ensure 99.9%+ availability, focusing on observability, automated scaling, and cost-efficiency
- Cross-Functional Collaboration: Partner with the AI team and Product Managers to translate complex data requirements into technical specifications and durable backend solutions
- Play a key role in designing and building the next generation of Stack’s Labeling Infrastructure
- Implement robust architecture to track the lifecycle of every log from unlabeled to production-ready
- Build scalable backend services for automated QA and AI-assisted labeling plugins
- Write high-quality Python and SQL
Requirements:
- Proven track record of building scalable, reliable infrastructure in a fast-paced environment
- Ability to collaborate effectively across teams
- Strong development experience with Python and SQL
- Prior experience with Trino, Flyte / Airflow, and Kubernetes is a plus
- Prior experience with ML Ops workflows is a plus
- Prior experience building and managing data platforms for multimodal ML needs is a plus
- Prior experience with agentic workflows is a plus
- Prior experience in autonomous vehicles (AV) is a plus