Optum is a global organization that delivers care aided by technology to help millions of people live healthier lives. In this role, you will design, develop, test, and deploy data pipelines and architectures to support Advisory Board's data and analytics projects, ensuring data quality and collaborating with data scientists to optimize analytics infrastructure.

Responsibilities:

Pipeline Development: Design, build, and maintain scalable, reliable, and efficient ETL/ELT pipelines using AWS Glue, Python, and SQL. Automate manual processes and optimize PySpark jobs for big data
Data Lake/Warehouse Management: Architect and manage data lakes (AWS S3), data warehouses (Redshift, S3 Tables) and relational databases (PostgreSQL, SQL Server). Use data modeling best practices to ensure data is accurate, accessible, and organized for efficient reporting and analysis
Cloud Infrastructure: Utilize AWS services like ECS and Lambda for data ingestion and orchestration tasks, involving a variety of external systems (Kafka, Snowflake, Databricks, custom API, etc.)
Data Quality & Security: Implement monitoring and troubleshooting measures to ensure data integrity and security, including IAM policies and CloudWatch logging
Collaboration: Work closely with data scientists and analysts to understand healthcare data and business requirements, and support data-driven decision-making
Analytics Development: Build datasets and dashboards, configure row-level security (RLS) and user permissions, and optimize dashboard performance on Amazon Quick Suite. Configure parameters for interoperability with embedded web app dashboards
CI/CD: Leverage GitHub for version control, code review, and automated deployment of pipelines across environments

Requirements:

8+ years of solid experience in data engineering using Python or PySpark
5+ years of experience with AWS including Glue, Lambda, IAM, Redshift
3+ years developing and optimizing PySpark jobs with proven big data experience
3+ years of experience applying best practices in data analysis and modeling
Experience optimizing the architecture for big data, query performance, ease of use, and data governance
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Experience working with cloud technologies (AWS preferred)
Proven expertise in SQL and data modeling with relational databases such as SQL Server and PostgreSQL
Experience in Amazon Quick Suite developing datasets and dashboards with support for web app embeddings
Understanding of health care claims data, including Medicare and commercial datasets
Proven eagerness and willingness to learn new technologies
Proven solid analytical, problem-solving, and decision-making skills
Demonstrated depth of health care knowledge and expertise
Proven written and oral communication skills

Data Engineer - Optum Advisory Board - Remote
