Flex enables health and wellness brands to accept HSA and FSA payments online, simplifying benefit usage for consumers. The company is seeking a Data Engineer to design and operate the data and machine-learning systems that determine HSA/FSA eligibility at scale, working with backend, product, and operations stakeholders to build reliable data products and pipelines.

Responsibilities:

Design, build, and own the data pipelines and ML services that classify product eligibility and power downstream decisions across Flex
Model the data domain (products, merchants, eligibility rules, classifications, and outcomes) in warehouses and serving systems other teams build on
Partner with backend, product, and operations stakeholders to translate merchant and consumer needs into reliable data products, models, and APIs
Own and improve the architecture of the data warehouse, transformation layer, ML training and inference systems, and real-time serving paths
Analyze, troubleshoot, and resolve production issues rooted in data quality, model accuracy, pipeline reliability, and serving latency
Collaborate on cross-functional projects connecting the full Flex experience, from consumer checkout to merchant analytics
Build and maintain evaluation harnesses, golden datasets, and observability for the models and pipelines you ship
Create and maintain documentation for data models, pipelines, and on-call runbooks
Contribute to a culture of learning, problem-solving, and operational excellence

Requirements:

5+ years building production data systems and pipelines in Python or a comparable typed language
Strong SQL and data-modeling fundamentals; experience with a modern cloud warehouse (Snowflake, BigQuery, Redshift, or similar) and a transformation framework like dbt
Hands-on experience deploying machine-learning models to production, owning training, inference, evaluation, and rollout, not just notebooks
Familiarity with at least one transformer-based ML framework (PyTorch + Hugging Face Transformers preferred) and a working sense of when classical or embedding-based models beat LLMs and when they don't
Resourceful, curious, and comfortable learning new tools quickly
Thrives in fast-paced, dynamic environments and enjoys wearing multiple hats
Collaborative, and enjoys working across teams to solve problems
Execution mindset with a focus on end users
Proficient at leveraging AI tools to ship faster
Experience with serverless compute platforms for data and ML workloads (Modal, Ray, AWS Lambda, GCP Cloud Run, or similar)
Production experience with vector databases and embedding-based retrieval
Self-hosted LLM inference experience (vLLM, TGI, SGLang) and a working sense of GPU economics
Background in payments, fintech, or health benefits (HSA/FSA), or another regulated, money-moving domain
Experience building or maintaining a golden-dataset evaluation harness for an ML system
Comfort reading and contributing to a Rust-based backend that consumes your APIs
