Lahzo is a tech startup focused on enhancing revenue growth through advanced demand generation and AI technology. They are seeking a Senior Data Engineer I/II to manage and optimize their data infrastructure, ensuring data quality and reliability while supporting client onboarding and data transformations.
Responsibilities:
- Architecture & patterns — Design the ETL, transformation, and modeling patterns the team builds on. Make build-vs-buy calls and own the tradeoffs
- Mentorship & standards — Review PRs, set conventions, and level up the I/II engineers and analysts
- ETL pipeline development — Build and maintain data ingestion pipelines that move data reliably from source into the warehouse. Own the infrastructure end-to-end
- Data transformation and table logic — Build and maintain both client-specific and shared transformation models. Handle schema changes, new table configurations, and the ongoing queue of transformation requests
- Data quality and anomaly detection — Own data quality monitoring end-to-end: define what we monitor and to what SLA, rather than just tuning thresholds, and decide where to spend the coverage budget. Extend coverage through assertions and automated alerting, turning reactive monitoring into proactive coverage
- Client onboarding infrastructure — Every new Lahzo client gets a dedicated cloud project, service accounts, permissions, and registered data pipelines. You own this process from infrastructure provisioning through the first clean pipeline run, as well as the architecture that makes it repeatable and increasingly self-service as we scale to dozens of clients
- Pipeline reliability and debugging — Understand the full data flow from raw event ingestion through final reporting tables. Debug issues across the stack end-to-end
- Ad hoc data requests — Own the complex, ambiguous requests and build the self-serve tooling that keeps the routine queue off engineering's plate
Requirements:
- 5+ years of hands-on data engineering experience, with a track record of owning production data infrastructure end-to-end
- Strong SQL — production-quality, comfortable with complex aggregations, window functions, and multi-step transformations
- Data transformation experience — you have built and maintained SQL-based transformation pipelines across multiple environments (dev / staging / prod)
- Infrastructure as code — you can provision and manage cloud data infrastructure, set up permissions, and debug access issues without hand-holding
- Python for data engineering — ETL scripts, pipeline tooling, and automation
- Data-quality strategist — you've designed monitoring and alerting strategy, not just tuned an existing one
- Systematic debugger — when something breaks, you trace it end-to-end across the stack rather than stopping at the first symptom
- AI-fluent but grounded — you use AI tools to move faster and validate more thoroughly, and you still understand what is happening underneath. You ship rather than chase the next shiny tool
- Motivated by technical impact — you want to be the person who truly understands the systems, and you see growing expertise as the path to more interesting and higher-impact work
- Cost- and scale-aware — you think about partitioning, clustering, and spend before it's a problem
- A force multiplier — you make the people and systems around you better
- Dataform or dbt
- Terraform on GCP
- BigQuery — partitioning, clustering, cost optimization
- Data quality monitoring tools — Monte Carlo, Great Expectations, or similar
- Multi-tenant or per-client data isolation patterns
- Cloud Functions or Cloud Run for ETL pipelines
- A/B experiment pipelines or marketing attribution models
- Hex or similar self-serve analytics tooling — building data products that non-technical teams can use independently