Qualitest, a leading AI-powered Quality Engineering company, is seeking a Data Engineer to join its growing U.S.-based team. The role involves building scalable data pipelines and collaborating with business, analytics, and engineering teams to design reliable data solutions.
Requirements:
- Strong hands-on experience in Python and PySpark, with a solid understanding of Spark concepts, scalable data processing, and performance tuning (joins, partitioning, caching, optimization)
- Practical experience with modern cloud data platforms such as Snowflake, Databricks, or similar
- Experience building end-to-end data pipelines, covering design, development, orchestration, and monitoring for both batch and streaming processing
- Familiarity with Medallion Architecture concepts, including bronze, silver, and gold layer design and best practices for data transformation
- Strong understanding of autoscaling, cluster optimization, and cost-efficient data processing in cloud or distributed data environments
- Experience with CI/CD, DevOps practices, and version control; exposure to Terraform or other infrastructure-as-code tools is preferred
- Ability to collaborate with business, analytics, and engineering teams to design reliable, reusable, and scalable data solutions