Mastech Digital is seeking a Senior Cloud Data Engineer responsible for designing, developing, and implementing robust data pipelines for cloud platforms to support analytics and AI/ML needs. The role involves collaborating with various teams, ensuring data quality and security, and mentoring junior engineers.
Responsibilities:
- Design, develop, and implement robust, scalable, and secure data pipelines in a cloud environment
- Build and manage ETL/ELT processes to efficiently move and transform large datasets from multiple data sources
- Implement secure data access, encryption, and data masking policies
- Develop automated processes to validate data quality and data accuracy
- Document and maintain data workflows and diagrams
- Work with data scientists and AI specialists to automate the model deployment lifecycle (MLOps)
- Configure and maintain cloud-based data warehousing solutions
- Optimize data warehouse storage strategies to support analytics and data science needs
- Set up monitoring tools and alerts to maintain data warehouse availability and reliability
- Troubleshoot, profile, and optimize data pipelines to resolve performance issues and minimize latency
- Work closely with data architects, data analysts, and data scientists to understand their data needs and translate them into technical designs
- Mentor and guide junior data engineers, perform code reviews, and establish best practices for cloud data engineering
- Collaborate with DevOps and ITOps to implement CI/CD pipelines and robust DR strategies
Requirements:
- Bachelor's degree in Computer Science, Computer Engineering, Information Systems, or a related field
- 7+ years of experience in data engineering with a focus on cloud data engineering
- Deep understanding of major cloud platforms (AWS, GCP, Azure) and major cloud data platforms such as Snowflake and Databricks
- Hands-on experience with data services offered by cloud platforms
- Expertise in a programming language such as Python, Java, or Scala, along with strong SQL skills
- Experience with ETL/ELT tools such as Talend, dbt, and Azure Data Factory
- Experience with CI/CD tools like GitLab/GitHub
- Strong knowledge of data governance, data security, and compliance practices
- Experience supporting data science and machine learning operations
- Familiarity with data visualization and reporting tools (e.g., Power BI, Tableau)