Amtex Systems Inc is seeking a highly skilled Data Engineer with deep expertise in Google Cloud Platform (GCP) and modern data architecture. The ideal candidate will design scalable data pipelines and implement a Medallion Architecture while ensuring data security and optimizing performance.
Responsibilities:
- Design, develop, and maintain scalable batch and real-time data pipelines on GCP
- Implement and manage Medallion Architecture (Bronze, Silver, Gold layers) for data processing
- Build high-performance data transformations using Python and PySpark
- Develop and optimize complex SQL queries for analytical workloads
- Work extensively with BigQuery for large-scale data processing and performance tuning
- Develop and deploy pipelines using Cloud Dataflow
- Orchestrate workflows using Cloud Composer (Apache Airflow)
- Manage data storage and lifecycle using Google Cloud Storage (GCS)
- Implement version control and CI/CD pipelines using Git-based tools
- Ensure data security, governance, and access control using GCP IAM
- Optimize data solutions for performance, scalability, reliability, and cost-efficiency
Requirements:
- Strong hands-on experience with Google Cloud Platform (GCP)
- Expertise in BigQuery (partitioning, clustering, query optimization)
- Proven experience implementing Medallion Architecture (Bronze, Silver, Gold layers)
- Strong programming skills in Python and PySpark
- Advanced proficiency in SQL (complex joins, window functions, performance tuning)
- Hands-on experience with Cloud Dataflow
- Experience with Cloud Composer (Airflow) for orchestration
- Experience working with Google Cloud Storage (GCS)
- Knowledge of version control systems (Git) and CI/CD practices
- Strong understanding of GCP IAM and cloud security best practices
- Experience working with large-scale enterprise data platforms
- Knowledge of data warehousing and data lake concepts
- Familiarity with real-time streaming frameworks
- Experience with data governance and data quality frameworks
- Exposure to Agile/Scrum methodologies