GENNTE Technologies is seeking an experienced and visionary Senior Full-Stack Data Engineer to lead the architecture, development, and optimization of a next-generation data platform. This role involves defining technical direction, mentoring team members, and delivering high-impact data solutions in a fast-paced environment.
Responsibilities:
- Platform Strategy & Leadership
  - Technical Direction: Define and champion the architectural roadmap and best practices for our end-to-end data pipelines, ensuring scalability, reliability, and security across the platform
  - Team Mentorship & Project Velocity: Act as a primary technical mentor, guiding a team of engineers, conducting code reviews, and aggressively driving the project timeline to ensure rapid delivery of data products
  - Stakeholder Collaboration: Partner with Data Scientists, Analysts, and business stakeholders to translate complex requirements into robust, production-ready data solutions
  - Collaboration with Data Scientists and ML Engineers: Ensure data accessibility, support model development, and provide data quality assurance
- Data Pipeline Development & Management
  - Ingestion & Transformation: Design, build, and optimize high-volume data ingestion and transformation jobs using tools like dbt Core, AWS Glue, or Flexter, ensuring data quality and integrity
  - Workflow Orchestration: Develop and maintain sophisticated data pipelines using orchestrators such as Dagster or Talend, focusing on modularity and reusability
  - Streaming & Real-time Integration: Implement and manage real-time data flows utilizing Confluent platforms or native AWS streaming services (e.g., Kinesis) for immediate data availability
  - Data Security and Privacy: Apply data anonymization and ensure compliance with regulations
  - Be well versed in DataOps and DevOps fundamentals
- Data Ecosystem Management & Monitoring
  - Open Table Formats & Management: Implement and maintain the Iceberg open table format, utilizing tools like Upsolver (Talend Open Lakehouse) for efficient schema evolution and data management
  - Compute Engine Optimization: Optimize query performance and cost efficiency across our primary compute engines: Snowflake, Amazon Redshift, and Amazon Athena
  - Observability & Monitoring: Integrate comprehensive monitoring and observability into all pipelines using Splunk to ensure high availability, rapidly identify bottlenecks, and troubleshoot production issues
Requirements:
- 15+ years of hands-on, progressive experience in Data Engineering, Data Architecture, or a closely related Full-Stack Data role
- Deep conceptual understanding of core data engineering principles, including data modeling (e.g., Dimensional, Data Vault), ETL/ELT patterns, and metadata management
- Proven track record of building and managing petabyte-scale data infrastructure in a cloud-native environment
- Well versed in DataOps and DevOps fundamentals
- Strong SQL, PySpark, and Python skills
- Experience with Talend, dbt Core, Iceberg, AWS Glue Data Catalog, Snowflake, Redshift, Athena, Splunk, AWS streaming services, and Git
- Insurance industry experience preferred but not mandatory