Develop and maintain end-to-end data pipelines and backend ingestion workflows, and help build Samsara's Data Platform to enable advanced automation and analytics.
Work with data from a variety of sources, including ERP (NetSuite), CRM (Salesforce), Product, Order Flow, and Support ticket data.
Manage critical data pipelines to enable growth initiatives and advanced analytics.
Facilitate data integration and transformation for moving data between applications, ensuring interoperability with data layers and the data lake.
Develop and improve data architecture, data quality, monitoring, observability, and data availability.
Write data transformations in SQL/Python to generate data products consumed by Analytics, Marketing Operations, and Sales Operations teams.
Design, build, and operate large-scale Spark and PySpark workflows for batch and streaming data processing across Databricks and cloud environments.
Optimize Spark job performance by tuning partitioning, shuffle, caching, and resource allocation for production-grade reliability and efficiency.
Define and enforce data engineering standards, patterns, and best practices across the team.
Design systems with long-term maintainability in mind: clear contracts, testable components, and thoughtful failure modes.
Collaborate with platform and infrastructure teams to evolve the underlying architecture of Samsara's enterprise data ecosystem.
Build and maintain MCP (Model Context Protocol) servers that expose Samsara's data assets and engineering workflows to AI models and internal tooling.
Collaborate with platform teams to integrate agentic workflows into the data engineering lifecycle.
Evaluate and adopt emerging AI-native tooling for data engineering, staying ahead of the curve on how LLMs and agents can accelerate data work.
Champion, role model, and embed Samsara's cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally.
Requirements
Bachelor's degree in computer science, data engineering, data science, information technology, or an equivalent engineering program.
8+ years of work experience as a Software Engineer with a data focus or as a Data Engineer.
5+ years of experience building and maintaining large-scale, production-grade end-to-end data pipelines, including data modeling.
5+ years of hands-on Spark / PySpark in a production environment, including job optimization and performance tuning.
Core Engineering Fundamentals: Strong programming capabilities in Python and SQL, combined with cloud data warehouse/lakehouse experience (e.g., Snowflake, Google BigQuery, Databricks, or Apache Iceberg).
Exposure to ETL tools such as Fivetran, dbt, or equivalent.
API experience: Python-based API frameworks for data pipeline ingestion.
RDBMS experience: MySQL, AWS RDS/Aurora, PostgreSQL, Oracle, MS SQL Server, or equivalent.
Cloud: AWS, Azure, and/or GCP.
Tech Stack
Apache
AWS
Azure
BigQuery
Cloud
ERP
ETL
Google Cloud Platform
MS SQL Server
MySQL
Oracle
Postgres
PySpark
Python
RDBMS
Spark
SQL
Benefits
Competitive salary
Initial RSU grant with no vesting cliff
Ongoing refresh opportunities tied to performance
Flexible, employee-led remote model
Professional development stipend
Comprehensive health and parental leave plans
Total rewards and benefits designed to fuel high-impact builders