Contrarian Thinking is building the infrastructure layer for modern entrepreneurs and is seeking a Data Engineer to turn messy data into reliable, trustworthy information. The role owns the data layer: building ELT pipelines and ensuring data quality across teams and products.
Responsibilities:
- Build and own ELT pipelines (Fivetran and comparable tools) that sync data from HubSpot and other sources into GCP reliably and on schedule
- Model raw data into clean, trusted, analysis-ready tables in dbt (staging, intermediate, and mart layers)
- Design and maintain the warehouse across PostgreSQL and BigQuery, tuned for the questions our products and teams actually ask
- Handle messy, ambiguous, real-world data: reconcile inconsistencies across sources and build pipelines that fail loudly, not silently
- Run reverse ETL to push modeled data back into HubSpot and other business tools, so teams work off the latest state
- Own data quality and observability: dbt tests, freshness checks, monitoring, and alerting that catch problems before anyone downstream notices
- Define and maintain the core metrics and the logic underneath them, so definitions stay consistent and 'my numbers don't match yours' goes away
- Keep the data that powers our products reliable and on time, including safe schema changes and migrations
- Automate the repetitive parts of the pipeline so they run themselves
Requirements:
- 5+ years building and maintaining data pipelines and warehouses in production
- Strong with dbt: transformation layers, tests, materializations, and well-structured projects (staging, intermediate, and mart layers)
- Hands-on with ELT tools like Fivetran (or Airbyte, Stitch, custom Python) to sync sources like HubSpot into a warehouse
- Deep BigQuery experience (or comparable warehouses), including performance tuning, partitioning, and cost awareness
- Comfortable with PostgreSQL for both transactional and analytical workloads
- Strong SQL as your primary language, plus Python for scripting and automation
- A track record with messy, ambiguous, real-world data: investigating quality issues, reconciling sources, and handling edge cases without silent failures
- Comfortable with Git and CI/CD for data pipelines
- You care about outcomes: data people trust, pipelines that don't break, numbers that match
- You must be available during US business hours, 9am to 5pm Central Time (CST/CDT), on weekdays
Nice to have:
- Built an ingestion and transformation pipeline from scratch, from sources to trusted marts
- Synced HubSpot or another CRM into a warehouse and kept it reliable
- Set up reverse ETL (Census, Hightouch, or similar) to push data back into business tools
- Experience with data orchestration (Airflow, Dagster, Prefect)
- Built data quality and freshness monitoring that caught problems before stakeholders did
- Experience in a high-growth startup, as a solo builder, or in another high-ownership environment