3M is a global company that values innovation and collaboration among its employees. The company is seeking a Lead Data Engineer to design, develop, and optimize data platforms that support advanced analytics and machine learning applications across the organization.
Responsibilities:
- Lead the architecture and development of scalable, secure data pipelines supporting AI/ML workloads
- Own end-to-end data engineering processes: ingestion, transformation, storage, quality, and monitoring
- Collaborate with data scientists and ML engineers on model features, training pipelines, and deployment
- Drive best practices in data modeling, orchestration, versioning, and performance optimization
- Mentor and guide junior engineers; contribute to technical roadmaps and solution patterns
- Ensure data governance, lineage, and compliance standards are met across platforms
- Support real-time and batch processing frameworks in production environments
Requirements:
- Bachelor's degree or higher in computer science from an accredited institution (completed and verified prior to start)
- Seven (7) or more years of data engineering experience, including leading technical initiatives
- Strong expertise in Python, SQL, and distributed data systems (e.g., Spark, Databricks, Synapse)
- Experience building AI/ML-ready data pipelines, including feature stores and model training data flows
- Hands-on experience with cloud platforms (Azure preferred: Data Lake, Data Factory, Databricks, Cosmos DB)
- Strong understanding of ML concepts, lifecycle, and MLOps practices
- Proven experience with workflow orchestration (Airflow, Data Factory, Synapse Pipelines, etc.)
- Strong communication skills and the ability to translate business needs into technical solutions
- Experience with streaming platforms (Kafka, Event Hub)
- Familiarity with vector databases, embeddings, or LLM-oriented data pipelines
- Background in DevOps, CI/CD, or infrastructure as code
- Experience designing data solutions for AI applications or intelligent platforms