About this role

Harnham is a fast‑growing, AI‑driven technology company building large‑scale, production‑grade ML systems used in real‑time decisioning. They are seeking a Senior Machine Learning Engineer who enjoys shaping platform architecture while staying hands‑on. The role focuses on production ML, with an emphasis on distributed compute and reliability.

Responsibilities:

Designing, training, and deploying high‑scale ML models used in live systems
Building distributed training pipelines (PyTorch, Ray)
Owning the ML lifecycle across feature engineering, training, evaluation, inference, and monitoring
Improving ML reliability, observability, and reproducibility
Working closely with engineering, SRE, and product to shape platform direction
Contributing to ML architecture standards, CI/CD, and testing frameworks

Requirements:

Strong experience delivering production ML systems end‑to‑end
Expertise with Python, PyTorch, and distributed compute (Ray, Spark)
Background in large‑scale data processing and MLOps tooling
Ability to diagnose production issues and drive architectural improvements
Experience with event‑driven ML and model deployment frameworks is a plus

Principal Machine Learning Engineer - ML Platform
