About this role

PySpark Mastery: Production-level expertise in Apache Spark using Python (PySpark). Must understand Spark internals (DAGs, shuffling, memory management, and optimization techniques).
Data Modeling: Proven track record of building complex data models from scratch (Star/Snowflake schemas, Data Vault, or 3NF)
Database & SQL: Expert-level proficiency in SQL. Extensive hands-on experience with massive relational databases (e.g., Oracle, PostgreSQL) and modern data warehouses/lakes (e.g., Snowflake, BigQuery, or Hive/Hadoop).
Systems Design: Clear understanding of distributed systems processing, ETL/ELT design patterns, and enterprise data warehousing principles
Communication: Demonstrated ability to translate complex technical concepts into clear, concise language for non-technical stakeholders and business leaders

We are seeking a highly experienced PySpark Developer to lead the design, architecture, and development of mission-critical data pipelines and enterprise data models. As a senior technical leader, you will bridge the gap between complex business requirements and highly scalable data architecture. The ideal candidate possesses deep expertise in PySpark, advanced SQL optimization, and enterprise data modeling. You will be not only a hands-on technical contributor but also an architectural guide: mentoring junior developers, establishing best practices, and ensuring that data solutions are highly performant, resilient, and aligned with Client
Lead the transition from legacy data structures to modern, scalable cloud/hybrid

PySpark Developer
