Samsara is the pioneer of the Connected Operations™ Cloud, helping organizations improve their physical operations through IoT data. They are seeking a Senior Data Engineer to design and maintain data pipelines that transform source data into actionable insights for analytics and model training.
Responsibilities:
- Build and maintain highly reliable computed tables, incorporating data from various sources, including unstructured data like video and audio, Samsara sensor & product data, and customer metadata
- Access, manipulate, and integrate external datasets with internal data
- Deliver high-quality data with strong uptime and reliability requirements, including customer-facing data sets
- Collaborate closely with cross-functional teams such as Data Science & Analytics, AI/ML, and other Data Engineers to ensure high-quality data for diverse purposes, ranging from causal inference to model training and dashboarding
- Champion, role model, and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally and across new offices
Requirements:
- BA / MS degree in Computer Science, Statistics, or a related discipline
- 4+ years of experience in a data engineering-focused role
- Demonstrated experience in designing data models at scale
- Proficiency in building ETL pipelines to handle large volumes of data
- Experience with Spark-based data platforms
- Strong command of at least one data orchestration tool (e.g., Airflow, Dagster, or Prefect)
- Expertise in SQL, Python, and working with REST APIs
- Familiarity with software engineering fundamentals and the ability to read backend code
- Experience with version control systems such as Git/GitHub
- Familiarity with time series data and late-arriving data
- Knowledge of Databricks, Delta Lake, and Dagster
- Previous experience working in a public cloud (e.g., AWS, GCP, Azure)
- Exposure to working on a data model for a product's first-party data
- Exposure to complex data, including ML outputs and/or client-side signals