Shutterfly is a company that helps customers create products and capture moments that reflect their unique selves. They are seeking a Principal Data Engineer to tackle scalability and performance challenges in data processing and storage for their eCommerce business, enhancing the Data Warehouse on AWS to support various analytic needs.
Responsibilities:
- Own the design, development, testing, deployment, maintenance and enhancement of full-stack data engineering solutions for the Data Pipelines & Data Mart encompassing the Data Warehouse
- Provide technical leadership to both the internal Data Warehouse team and the publishers & subscribers of Shutterfly's Enterprise Data Lake
- Identify, evaluate and evangelize, through data-based evidence, improvements to the Data Lake as well as the data processing environment, and thereby influence the data strategy
- With your technical expertise, own and manage project priorities, deadlines and deliverables
- Always with a customer focus, evangelize the benefits of existing solutions and new technologies to drive adoption of the Data Warehouse and push its technology forward
- Work closely with Data Operations to improve CI/CD pipelines, as well as continually improve the operations, monitoring and performance of the Data Warehouse
- Work across multiple teams in high visibility roles and own solutions end-to-end
Requirements:
- Expert knowledge of Python, Spark, and SQL; experience with large-scale data processing and distributed systems
- 10+ years of hands-on experience building data platforms, including data pipelines, data warehousing, and feature engineering systems
- Proven experience championing the adoption of AI-powered tools to increase team productivity, reduce manual effort, and improve operational efficiency
- Strong foundation in data structures, algorithms, and system design for large-scale data and AI systems
- Experience with AWS ecosystem (e.g., S3, EMR, Glue, Lambda, SageMaker) or equivalent cloud platforms (GCP, Azure)
- Hands-on experience with Databricks and modern lakehouse architectures
- Familiarity with real-time data streaming frameworks
- Experience supporting Machine Learning and AI workloads, including feature engineering, model data pipelines, and training data preparation
- Bachelor's or master's degree in computer science or equivalent