SilverSearch, Inc. is a highly recognized organization seeking a Senior Machine Learning Engineer to design, build, and optimize production machine learning systems. This role focuses on developing and tuning inference pipelines for multimodal content, ensuring efficient processing of text, image, and video data.

Responsibilities:

Designing, building, and optimizing ML-powered inference systems supporting text, image, and video workloads
Developing scalable pipelines for embeddings, semantic search, vector retrieval, reranking, and multimodal processing
Optimizing inference performance across transformer-based NLP and/or computer vision models, including tuning for latency, throughput, batching, concurrency, and memory efficiency
Supporting large-scale distributed inference workloads across hybrid CPU/GPU environments and cloud infrastructure (AWS preferred)
Building resilient asynchronous processing systems with strong observability, fault tolerance, logging, retries, caching, and performance monitoring
Partnering with engineering and data science teams to continuously improve production model performance and deployment reliability

Requirements:

Experience building and scaling inference pipelines in production environments
Experience improving latency, throughput, memory utilization, and model-serving efficiency across distributed workloads
Strong hands-on experience with PyTorch, TensorFlow, and transformer models, as well as semantic/vector search, embeddings, retrieval systems, distributed inference, and production ML optimization
Experience designing, building, and optimizing ML-powered inference systems supporting text, image, and video workloads
Experience developing scalable pipelines for embeddings, semantic search, vector retrieval, reranking, and multimodal processing
Experience optimizing inference performance across transformer-based NLP and/or computer vision models, including tuning for latency, throughput, batching, concurrency, and memory efficiency
Experience supporting large-scale distributed inference workloads across hybrid CPU/GPU environments and cloud infrastructure (AWS preferred)
Experience building resilient asynchronous processing systems with strong observability, fault tolerance, logging, retries, caching, and performance monitoring
Experience partnering with engineering and data science teams to continuously improve production model performance and deployment reliability
Experience with video processing workflows
Experience with multimodal AI systems
Experience with large-scale inference environments
