Brooksource's Fortune 50 healthcare client is seeking a Lead Data Engineer to provide technical leadership and delivery oversight across multiple AI data products within their enterprise AI Hub. This role focuses on technical direction, architectural guidance, and team leadership while remaining hands-on in building scalable data pipelines and AI-enabling data assets.
Responsibilities:
- Provide technical leadership across multiple AI Data Product initiatives and engineering workstreams
- Understand and clarify technical requirements, recommend architecture/design elements, and set overall technical direction across projects
- Design, implement, and maintain scalable ETL/ELT pipelines and distributed data workflows using Databricks/Spark technologies
- Implement and optimize CI/CD pipelines, data operations workflows, and cost management strategies across the data platform
- Build and support AI-enabling data assets such as vector stores, feature tables, Genie Rooms, and semantic AI context assets, while ensuring integration into model development workflows
- Partner with AI/ML, analytics, platform, and business teams to deliver production-grade data solutions
- Support platform visibility by delivering operational insights into platform utilization, cost trends, and financial operations
- Oversee and support junior through senior engineers with POCs, technical guidance, troubleshooting, and code reviews
Requirements:
- Strong hands-on experience with Databricks Data Engineering and Spark distributed computing
- PySpark and Python expertise for large-scale data processing
- Strong SQL skills and experience with data warehouses and data analysis
- Hands-on experience building data pipelines (batch and streaming)
- Experience working with columnar data formats (Parquet, Delta)
- Experience with DevOps practices, CI/CD pipeline development, and Git workflows (GitHub/GitLab)
- Familiarity with Linux scripting fundamentals (for pipeline and CI/CD automation)
- Exposure to emerging AI data infrastructure, such as building vector stores and applying DataOps / MLOps practices
- Experience providing technical leadership across multiple concurrent projects, including architectural guidance, defining technical work, and setting technical direction