Soho Square Solutions is seeking a Senior Data Engineer to design and optimize data pipelines on Databricks using Delta Lake, Spark, Python, and SQL. The role involves building scalable datasets for analytics and AI applications, managing data governance through Unity Catalog, and collaborating with business and technical teams to deliver data solutions.
Responsibilities:
- Design and optimize Databricks Medallion Architecture (Bronze, Silver, Gold) pipelines using Delta Lake, Spark, Python, and SQL
- Ingest, process, and transform structured and unstructured data from enterprise systems
- Build scalable, AI-ready datasets and data models for analytics and LLM-powered applications
- Develop data quality, validation, and monitoring frameworks
- Support RAG pipelines, embeddings, and data preparation for AI agents
- Manage Unity Catalog governance, access controls, and schema management
- Collaborate with business and technical teams to deliver scalable data solutions
Requirements:
- 7+ years of experience with Databricks, Spark, Delta Lake, Python, and SQL
- Strong experience implementing Medallion Architecture at scale
- Expertise in processing unstructured data (documents, PDFs, text, images)
- Experience supporting AI/ML, Generative AI, or LLM-based solutions
- Knowledge of embeddings, RAG, and AI agent workflows
- Strong problem-solving and stakeholder management skills