Design, build, and optimize scalable data processing pipelines using Spark, Airflow, and related big data technologies to support both batch and real-time use cases.
Lead and collaborate with technical, application, and security stakeholders to deliver reliable, secure Big Data infrastructure leveraging tools and platforms such as Spark, Dataproc, Redpanda, and Temporal.
Own and participate in on-call responsibilities for Big Data platforms, triaging and resolving incidents, responding to tickets, and ensuring systems consistently meet defined SLAs for availability, performance, and data quality.
Design, develop, and automate large-scale infrastructure on Kubernetes using Terraform and other infrastructure-as-code patterns to support high‑performance data processing, analytics, and AI/ML workloads.
Define and implement monitoring, alerting, and runbooks, using tools like Grafana, to provide end‑to‑end observability and drive continuous improvement in reliability and operational excellence.
Onboard, train, and mentor vendor teams and external partners so they can effectively support, operate, and extend the solutions owned by the Big Data Infrastructure (BDI) team.
Drive the technical roadmap for emerging data and infrastructure technologies by evaluating options, building proofs of concept (POCs), and authoring solution selection and design documents.
Champion engineering best practices (code reviews, testing strategies, CI/CD, security and compliance standards) across the Big Data ecosystem to ensure maintainable, resilient, and cost‑efficient systems.
Requirements
7+ years of experience in software engineering or data engineering, with a focus on large-scale data processing and distributed systems.
Hands-on experience building and operating pipelines with Apache Spark (batch and/or streaming) and at least one orchestration framework such as Airflow or Temporal.
Practical experience running Big Data workloads on cloud platforms (for example, using Dataproc or similar managed compute services).
Proficiency with Kubernetes and infrastructure-as-code tools such as Terraform to provision, configure, and manage production services.
Experience with modern streaming and messaging technologies (for example, Redpanda, Kafka, or similar) and real-time data processing patterns.
Solid understanding of systems design, scalability, reliability, and observability for data-intensive workloads.
Experience participating in or leading on-call rotations and incident management processes for production systems.
Strong collaboration and communication skills, including working with cross-functional partners (security, application owners, vendors) and documenting designs and decisions clearly.
Ability to evaluate new technologies, build POCs, and make data-informed recommendations that influence team and platform roadmaps.
Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related technical field, or equivalent practical experience.
Tech Stack
Airflow
Apache
Cloud
Distributed Systems
Grafana
Kafka
Kubernetes
Spark
Terraform
Benefits
Work with talented, collaborative, and friendly people who love what they do.
We host in-person and virtual events such as game nights, happy hours, camping trips, and sports leagues.
Flexible paid time off, paid holidays, options for working from home, and paid parental leave.
Comprehensive benefits package designed to help you be your best self in your personal and professional lives.
Our benefits package includes medical, dental, vision, life, and disability coverage, an employee assistance program, and voluntary benefits, as well as perks programs that support a healthy lifestyle, career growth, and more.
Our 401(k) matching plan (1:1 match up to 6% of salary) helps you plan ahead.