Own and evolve large-scale ML pipelines powering Spotify’s content-resolution systems
Lead development of multimodal embedding frameworks supporting multimodal understanding, music video matching, and SongDNA
Improve entity-resolution systems across music and video content, helping Spotify better understand relationships between recordings, versions, and content formats
Design and run experiments to improve precision, recall, and overall content-quality outcomes using offline evaluation, golden datasets, A/B testing, and impact analysis
Build scalable ML evaluation and monitoring infrastructure, including standardized datasets, retraining workflows, and continuous improvement systems
Contribute to the evolution of the Music Knowledge Graph by improving production ML capabilities, observability, and model lifecycle management
Partner closely with Product Managers, Data Scientists, and engineering teams across Content Platform and the wider Experience Mission
Help shape technical strategy for the squad and contribute to long-term ML direction across the product area
Mentor engineers and contribute to a strong culture of technical collaboration and experimentation
Requirements
solid experience building, deploying, and maintaining machine learning systems in production at scale
strong experience training, evaluating, and operating ML models using modern frameworks such as PyTorch or TensorFlow
experience working with multimodal machine learning systems across audio, computer vision, text embeddings, or related domains
understanding of entity resolution, deduplication, record linkage, or large-scale matching problems, ideally across multiple content modalities
ability to design evaluation systems that balance model quality, operational performance, and real-world impact
experience working with large-scale distributed data processing systems and ML infrastructure
ability to communicate effectively across engineering, product, and data science stakeholders
comfortable leading technical initiatives and influencing engineering direction within a team
experience with Scio, Dataflow, Flyte, BigQuery, or similar distributed processing frameworks is a plus
experience with Scala is a plus
experience with computer vision, video understanding, multimodal embeddings, or recommendation systems is a strong plus