About this role
Role Overview
Responsible for collecting, compiling and analyzing large volumes of unstructured data, interpreting the results, and helping to shape strategies and decisions that support business growth.
Mine data from primary and secondary sources;
Clean and organize data to remove irrelevant information;
Analyze and interpret results using statistical tools and techniques;
Identify trends and patterns in datasets;
Identify new opportunities for process improvement;
Design, build and maintain databases and data systems;
Troubleshoot code issues and data-related problems;
Support and monitor the client’s data platform.
Requirements
Bachelor’s degree in Information Technology, or any undergraduate degree combined with a postgraduate qualification (specialization, master’s or doctorate) in Information Technology of at least 360 course hours.
Preferred experience: 2 years in extraction, modeling and organization of large volumes of data for analytical consumption.
Knowledge of advanced SQL, Python (Pandas), and shell scripting or equivalent;
ETL experience with Pentaho, Talend, SSIS, Apache NiFi or similar tools;
Databases: Oracle, SQL Server, PostgreSQL;
Cloud storage (Azure Data Lake, AWS S3) and API connections;
Concepts of Data Warehouse, dimensional and normalized data modeling;
Version control (Git), job orchestration/control (Airflow, cron).