Cimpress is a global leader in mass customization, empowering over 17 million customers through personalized products. The Senior Data Engineer will be responsible for handling large data volumes, orchestrating ETL/data pipelines, and leveraging machine learning and AI tools to improve data engineering tasks.
Requirements:
- 3+ years of experience handling large data volumes and orchestrating and monitoring automated batch and near-real-time ETL/data pipelines using CI/CD and cloud technologies; expertise in dbt or dbt Cloud preferred
- Strong programming skills in Python and SQL
- Solid experience with MPP Data Warehouse systems such as Snowflake or Amazon Redshift, along with cloud platforms including AWS (Preferred), Azure, or GCP
- Expertise in Data Modelling and Data Warehousing best practices, with a strong ability to adopt development best practices such as modularization, testing, and refactoring
- Experience with Business Intelligence and reporting tools such as Looker is an added advantage
- Hands-on experience with Machine Learning (ML) and MLOps practices, including model training, versioning, deployment, monitoring, and lifecycle management using tools like MLflow, SageMaker, or Vertex AI
- Ability to leverage AI tools in day-to-day data engineering tasks, such as using GitHub Copilot or Claude for code generation, debugging, query optimization, and pipeline development to improve productivity and efficiency
- Exposure to Generative AI tools and frameworks (e.g., LangGraph, LangChain, OpenAI APIs)
- Curiosity to explore and implement evolving data engineering and Generative AI technologies
- Understanding of modern practices such as DataOps and MLOps, or equivalent experience
- Strong problem-solving skills, a solid understanding of data structures and algorithms, and the ability to thrive in ambiguous environments with minimal oversight