Design, develop, and maintain robust ETL/ELT pipelines to ingest, transform, and deliver data across enterprise platforms
Build scalable data ingestion frameworks for structured and semi-structured data, including XBRL filings and financial datasets
Implement data transformation logic to support analytics, reporting, and regulatory use cases
Ensure data pipelines are reliable, performant, and scalable in cloud environments
Leverage AI-assisted development tools to accelerate pipeline development, testing, and optimization
Develop and manage data solutions using AWS services (e.g., S3, Glue, Lambda, Redshift) and Airflow DAGs for orchestration
Implement and optimize Apache Iceberg tables for large-scale, ACID-compliant data lakes
Support lakehouse architectures that unify data lakes and data warehouses
Optimize data storage and retrieval strategies for performance and cost efficiency
Design and implement CI/CD pipelines for data pipelines, infrastructure, and analytics code using tools such as GitHub Actions, GitLab CI, Jenkins, or AWS-native services
Integrate AI-driven testing and monitoring tools to improve pipeline quality and reduce operational risk
Ensure alignment with data governance frameworks and standards established by OCDO organizations, including AI data readiness and traceability
Collaborate with data architects, analysts, and business stakeholders to understand data needs and deliver solutions
Requirements
Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field
5+ years of experience in data engineering, ETL development, or data platform engineering
Strong hands-on experience with ETL/ELT tools and frameworks
Hands-on experience with AWS data services (S3, Glue, Lambda, Redshift, etc.)
Hands-on experience with Apache Iceberg and modern data lake architectures
Experience designing and implementing CI/CD pipelines for data platforms and ETL workflows
Demonstrated proficiency using AI tools and AI-assisted development workflows (e.g., LLM copilots, automated code generation, pipeline optimization tools)
Experience processing XBRL or complex financial/regulatory datasets
Proficiency in SQL and Python
Experience implementing materialized views and query optimization techniques
Understanding of data modeling concepts and metadata management
Familiarity with data governance, data quality practices, and data readiness for AI/ML use cases
Ability to work in Agile, DevOps-oriented environments
U.S. citizenship required; must be able to obtain and maintain a federal clearance