Flex enables health and wellness brands to accept HSA and FSA payments online, simplifying benefit usage for consumers. Flex is seeking a Data Engineer to design and operate the data and machine-learning systems that determine HSA/FSA eligibility at scale, working with stakeholders across teams to build reliable data products and pipelines.
Responsibilities:
- Design, build, and own the data pipelines and ML services that classify product eligibility and power downstream decisions across Flex
- Model the data domain (products, merchants, eligibility rules, classifications, and outcomes) in warehouses and serving systems other teams build on
- Partner with backend, product, and operations stakeholders to translate merchant and consumer needs into reliable data products, models, and APIs
- Own and improve the architecture of the data warehouse, transformation layer, ML training and inference systems, and real-time serving paths
- Analyze, troubleshoot, and resolve production issues rooted in data quality, model accuracy, pipeline reliability, and serving latency
- Collaborate on cross-functional projects connecting the full Flex experience, from consumer checkout to merchant analytics
- Build and maintain evaluation harnesses, golden datasets, and observability for the models and pipelines you ship
- Create and maintain documentation for data models, pipelines, and on-call runbooks
- Contribute to a culture of learning, problem-solving, and operational excellence
Requirements:
- 5+ years building production data systems and pipelines in Python or a comparable typed language
- Strong SQL and data-modeling fundamentals; experience with a modern cloud warehouse (Snowflake, BigQuery, Redshift, or similar) and a transformation framework like dbt
- Hands-on experience deploying machine-learning models to production: owning training, inference, evaluation, and rollout, not just notebooks
- Familiarity with at least one transformer-based ML framework (PyTorch + Hugging Face Transformers preferred) and a working sense of when classical or embedding-based models beat LLMs and when they don't
- Resourceful, curious, and comfortable learning new tools quickly
- Thrive in fast-paced, dynamic environments and enjoy wearing multiple hats
- Collaborate well and enjoy working across teams to solve problems
- Execution mindset with a focus on end users
- Proficient at leveraging AI tools to ship faster
- Experience with serverless compute platforms for data and ML workloads (Modal, Ray, AWS Lambda, GCP Cloud Run, or similar)
- Production experience with vector databases and embedding-based retrieval
- Self-hosted LLM inference experience (vLLM, TGI, SGLang) and a working sense of GPU economics
- Background in payments, fintech, or health benefits (HSA/FSA), or another regulated, money-moving domain
- Experience building or maintaining a golden-dataset evaluation harness for an ML system
- Comfort reading and contributing to a Rust-based backend that consumes your APIs