Reddit is a community-driven platform that facilitates open conversations and information sharing. The company is seeking a Senior Staff Software Engineer to lead the development of its ML Indexing & Retrieval systems, focusing on building scalable and reliable platforms that enhance machine learning capabilities. The role involves collaborating with cross-functional teams and mentoring engineers to ensure the infrastructure meets the growing needs of Reddit's user base.
Responsibilities:
- Lead the technical strategy, architecture, and implementation of Reddit’s next-generation ML Indexing & Retrieval engine, integrating capabilities across lexical and vector indexing, low-latency retrieval, and emerging GenAI applications
- Partner closely with product engineers across Content Understanding, Search, Feeds, Ads, Growth, and Safety to deliver high-quality experiences
- Define best practices for observability, reliability, and operational excellence in large-scale distributed systems
- Mentor and guide engineers in designing scalable infrastructure and adopting robust DevOps and SRE principles
- Collaborate with infrastructure and ML teams to ensure the platform evolves to meet the needs of Reddit’s growing user base and diverse content ecosystem
Requirements:
- 10+ years of experience in software engineering, specializing in Indexing and Retrieval systems
- 3+ years in technical leadership, architecting and scaling distributed systems in production environments
- Deep expertise in large-scale data platforms, including batch indexing and stream processing
- Proven experience designing and operating large-scale, low-latency retrieval services
- Expertise in lexical and vector search and retrieval technologies, such as Milvus, Vespa, or Elasticsearch
- Skilled in designing cloud-native architectures and managing containerized workloads using Kubernetes and AWS/GCP
- Adept at translating complex technical challenges into clear, actionable strategies
- Strong communicator and mentor who leads through collaboration, influence, and technical excellence
- Languages: Go, Java, Python, or any object-oriented programming language
- Frameworks: Flink, Airflow, and Spark for large-scale batch and stream processing
- Databases: Familiarity with vector, lexical, and key-value databases
- Tools: Kubernetes, Docker, AWS, GCP