Build automated, scalable data pipelines to deliver data and innovative products for our clients;
Design and develop efficient solutions for data extraction and centralization, data quality assurance and governance, and the creation of new business metrics;
Deploy data pipelines in AWS environments;
Perform code reviews for the data engineering team;
Participate in project roadmap discussions with the client;
Recommend improvements and optimizations;
Identify new opportunities and potential projects with the client.
Requirements
Advanced programming skills in Python and PySpark, including working with APIs and SDKs;
Experience with complex ETL processes and observability of data pipelines;
Experience with relational databases and SQL, including best practices, query optimization, stored procedures, and data warehousing (DW);
Experience with Airflow, Docker, Kubernetes, infrastructure configuration/management, application deployment, and DevOps practices;
Strong familiarity with AWS services such as S3, DynamoDB, EMR, Athena, Lambda, Redshift, Kinesis, EKS, and Glue.
Tech Stack
Airflow
Amazon Redshift
AWS
Docker
DynamoDB
ETL
Kubernetes
PySpark
Python
SQL
Benefits
Flexible meal/food allowance (Swile)
TotalPass (gym membership)
Zenklub (online mental health support)
SulAmérica health plan, 100% subsidized for you (Eu.A3)
Amil dental plan
Life insurance
Profit sharing
Annual bonus
Discount on language courses from Open English
AWS Advanced tier partnership
Discount on electricity bills (CEMIG provider)
Empresa Cidadã (Citizen Company) program: extended maternity and paternity leave.