Architect Multi-Modal Vision Systems: Design and train models that fuse 2D inputs with 3D geometry to solve complex grasping and scene understanding tasks.
Lead End-to-End Model Deployment: Own the transition from research to reality, including model graph optimization, quantization (TensorRT), and runtime integration.
Drive Technical Excellence: Conduct rigorous code reviews, mentor junior engineers, and contribute to the strategic perception roadmap.
Own the Data Strategy: Take ownership of the existing labeled dataset and pipeline, identifying bottlenecks and improving data quality.
Ensure Production Reliability: Write high-performance production code (Python/C++) to integrate perception outputs into the broader robotic control stack, prioritizing safety and stability.
Requirements
5+ years of experience in Computer Vision and Machine Learning, with a track record of shipping ML products to the physical world (Robotics, AV, or IoT).
Expert-level Python and PyTorch skills. Working knowledge of C++ for deployment and system integration.
Experience with 2D Vision (YOLO, Mask R-CNN, Transformers) and 3D Vision (PointNet, grasp generation, multi-view geometry, camera calibration).
Proficiency with inference optimization tools such as TensorRT, ONNX Runtime, or CUDA to maximize hardware utilization.
Experience curating large-scale datasets, detecting statistical bias, and automating quality assurance within the ML pipeline.
Ability to translate high-level product requirements into specific engineering tasks and explain technical trade-offs to non-expert stakeholders.
Familiarity with Docker, AWS (S3, EC2) or GCP, labeling platforms, and experiment tracking tools.
Tech Stack
AWS
Docker
EC2
Google Cloud Platform
IoT
Python
PyTorch
Benefits
Health, dental, and vision insurance
Unlimited vacation
401(k) contributions of 5%
Travel supplies
Other items to make your working life more fun, comfortable, and productive