Egen is a fast-growing, entrepreneurial company with a data-first mindset. It is seeking a Lead Data Engineer to architect and build modern, cloud-native data platforms on Google Cloud. The role involves leading the design of data pipelines, driving data quality practices, and mentoring engineers while solving complex data challenges.
Responsibilities:
- Architect and optimize large-scale data platforms on Google Cloud, with BigQuery as the analytical backbone
- Design and build unified batch and streaming pipelines that handle high-volume, mission-critical workloads
- Lead infrastructure-as-code practices, ensuring environments are repeatable, secure, and version-controlled
- Implement open table formats to enable cross-cloud and cross-engine data interoperability
- Establish automated data quality, metadata, and lineage practices across the data estate
- Partner with data scientists, analysts, and product teams to translate business needs into reliable data products
- Mentor engineers, review designs, and raise the bar on engineering standards
Requirements:
- 7+ years of data engineering experience, including at least 2 years in a lead or senior individual contributor role on Google Cloud-based platforms
- BigQuery (Advanced): Deep knowledge of BigQuery architecture, including partitioning, clustering, slot management, storage optimization, and query execution tuning
- Streaming & Batch Pipelines: Strong hands-on experience building unified pipelines using Dataflow (Apache Beam), Dataproc, and Pub/Sub
- Infrastructure as Code: Production experience developing and managing cloud infrastructure with Terraform
- Open Table Formats: Working knowledge of Apache Iceberg, including its role in enabling cross-cloud and cross-engine interoperability (e.g., across BigQuery, Spark, Snowflake)
- Data Governance: Experience with Dataplex and Data Catalog for automated data quality checks, metadata tagging, and column-level lineage from source to destination
- Experience leading or mentoring data engineering teams
- Familiarity with CI/CD for data pipelines (Cloud Build, GitHub Actions)
- Exposure to multi-cloud or hybrid data architectures
- Background in regulated industries (healthcare, financial services) where governance and lineage are critical