Mozilla Corporation is a non-profit-backed technology company that has shaped the internet for the better over the last 25 years. As a Senior Data Engineer, you will manage the data lifecycle, transforming raw data into reliable models and ensuring data quality while collaborating with various teams to drive data-informed decisions.
Responsibilities:
- Build the pipes: You’ll build and manage our business data pipelines and transform Firefox telemetry data into structured, high-quality datasets that Mozillians can rely on
- Make data make sense: You’ll partner with data scientists, product, and marketing teams to turn complex datasets into the models and metrics that drive Mozilla’s decisions
- Be a guardian of data quality: You’ll ensure our datasets stay accurate and performant by using our observability tools to monitor quality and joining a weekly triage rotation to resolve data issues as they arise
- Evolve our platform: You’ll partner with other data engineers and platform engineers to improve how we work
- Maintain data integrity: You’ll ensure our data governance and privacy policies are implemented and enforced across the tens of terabytes of data flowing through our systems daily
Requirements:
- At a minimum, 4 years of professional experience in data engineering
- Proficiency in SQL and Python, and an eagerness to integrate AI into data engineering workflows to help us scale
- Experience mapping complex business processes into extensible, analytical data models. You have a clear vision for long-term architecture but prefer an incremental approach that delivers immediate value while allowing our models to evolve
- You build modular, reusable code and efficient logic that remains performant when processing data at our scale
- You are comfortable taking a high-level goal and turning it into a plan. You have a track record of owning projects from start to finish, managing your own blockers, and keeping your team in the loop
- You possess strong communication skills and the ability to work effectively with a distributed team across different time zones
- You should be proficient in one or more of the areas listed below, and be interested in learning about the others:
  - You have a track record of recommending and implementing new data collection methods to improve data model quality
  - You have used data to answer specific questions and guide company decisions
  - You have experience with cloud distributed systems (we use GCP), including distributed databases, message queues, or batch/stream processing
  - You ensure data is safe to depend on through smart orchestration, error handling, and idempotency
- Commitment to our values: welcoming differences, being relationship-minded, practicing responsible participation, and having grit