Save A Lot is seeking a Principal Data Engineer to lead the design, development, and operation of the platforms and pipelines that support its data science capabilities. The role combines data engineering and data science and requires collaboration with various teams to ensure reliable data flow and effective AI/ML implementation.
Responsibilities:
- Define the long-term technical direction for the data science platform and its integration with existing ELT pipelines
- Ensure platforms are scalable, reliable, secure, and cost-efficient at enterprise scale
- Evaluate and adopt emerging tools in the modern data and ML stack
- Design, develop, and optimize ETL pipelines and outbound data feeds
- Develop and follow templates and engineering patterns that reduce the time to deploy new data assets or changes to existing data models and analytics solutions
- Partner with key business teams to understand their data needs and help them build appropriate data solutions
- Design, build, and optimize end-to-end data science pipelines — from raw data ingestion through feature engineering, model training, and inference serving
- Contribute to MLOps practices including model versioning and monitoring, supporting the transition of data science work into production
- Provide technical guidance to data engineers
- Conduct code reviews and champion engineering best practices across workstreams
- Lead without direct authority, influencing cross-functional teams of data engineers, analysts, and product owners
- Establish best practices for data quality, lineage, privacy, and security across data engineering and science pipelines
- Ensure model inputs and outputs are auditable, reproducible, and compliant with data governance standards
- Partner with data engineering, product owners, and software engineers to align platform capabilities with organizational AI/ML goals
- Translate complex technical concepts into clear, actionable insights for non-technical stakeholders
Requirements:
- Bachelor's degree in computer science, engineering, mathematics, or a related field, OR 7+ years of equivalent verifiable experience, skillset, and record of accomplishment
- Experience in a Principal or Senior Data Engineer role with direct involvement in ML platform or Data Science work
- Proficiency in an analytics/BI tool such as Power BI
- Modern data stack technologies — Databricks (strongly preferred), Snowflake, Spark
- Inbound and outbound data transfer via APIs and FTP
- MPP databases such as Databricks, Snowflake, BigQuery, Teradata, or Azure Synapse
- Cloud platforms — AWS, Azure, or GCP
- Python and SQL
- Building and deploying ML models (classification, regression, forecasting, NLP, or similar)
- Familiarity with ML frameworks such as scikit-learn, XGBoost, PyTorch, or TensorFlow
- MLflow or similar tools for experiment tracking, model registry, and deployment
- Understanding of feature engineering, model evaluation, and common ML failure modes
- Strong understanding of data modelling techniques (Kimball, Data Vault) and distributed systems
- Familiarity with feature stores, training pipelines, and batch/real-time inference architectures