Build production augmentation and profile models: Design, train, evaluate, and deploy models against real survey data, owning the full path from experiment to production.
Innovate on solutions: We aim to build the best simulation of survey responses in the industry. You'll design the experiments, baselines, and models to get us there.
Use the latest technology: Fine-tune the best open-weights LLMs, build new models from scratch with custom architectures, or use third-party models through an API; whatever gets us the best result.
Partner with Data Science: Work with the data science team to ensure the integrity of our statistical testing and experimentation framework.
Leverage AI tooling: Use Claude Code and agentic programming tools where appropriate.

Excellent programming skills, with proficiency in Python
Knowledge of Java is a plus
Knowledge of traditional machine learning tools and techniques (support vector machines, gradient boosting)
An understanding of LLMs and their architecture, ideally with experience in fine-tuning (e.g. LoRA, distillation)
Proficiency in SQL; knowledge of Databricks is a plus
Familiarity with cloud technologies (e.g. AWS)
Outstanding problem-solving and analytical skills
Knowledge of maths, probability, statistics and algorithms

AI/ML Engineer – Synthetic Data

Key skills