Design and implement scalable data warehouse and lakehouse architectures on the Cloudera platform
Define enterprise data models, governance frameworks, data stewardship processes, security standards, and data quality practices
Architect and optimize analytics solutions across SQL engines such as Impala and Hive, and open table formats such as Iceberg
Design AI-powered analytics solutions leveraging LLMs, Retrieval-Augmented Generation (RAG), vector stores (such as PostgreSQL with a vector extension, Qdrant, Milvus), and natural language query (NLQ) capabilities
Lead the integration of AI/ML capabilities into enterprise data platforms and data pipelines while establishing governance controls for AI models, data usage, and lifecycle management
Leverage vibe coding / AI-assisted development tools to accelerate development and improve productivity
Build and optimize batch and near real-time data pipelines
Collaborate with business stakeholders to translate business requirements into scalable data products and analytics solutions
Establish best practices for performance optimization, data architecture, and AI-assisted development
Mentor teams on modern data architecture and AI-enabled development methodologies
Ensure data security, governance, compliance, and responsible AI practices within enterprise data platforms and AI-enabled solutions
Partner with FP&A, Sales, and Revenue Operations to deliver scalable data solutions that support financial forecasting, budgeting, revenue optimization, pipeline analysis, and sales forecasting
Requirements
Bachelor’s degree in Computer Science or equivalent and 5-6 years of related experience; OR Master’s degree and 3-5 years of related experience; OR PhD and 0-3 years of related experience
Deep expertise in enterprise data warehousing, lakehouse architectures, and Cloudera-based data platforms
Strong experience with Cloudera Data Platform (CDP), including HDFS, Hive, Impala, Kudu, and Cloudera data ingestion and processing frameworks
Strong understanding of distributed data systems and Hadoop-based architectures
Advanced SQL skills, including performance tuning and query optimization
Proficiency in Python and data engineering frameworks
Experience with dimensional and normalized data modeling
Strong understanding of data governance, lineage, metadata management, data cataloging, enterprise security, and compliance requirements
Experience implementing AI governance practices including model governance, AI risk management, explainability, monitoring, and responsible AI controls
Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production environments
Familiarity with AI-assisted development tools (e.g., GitHub Copilot and LLM-powered workflows)
Strong communication, stakeholder management, and problem-solving skills
Ability to align enterprise data architecture with business objectives in Finance, Sales, and Revenue Operations
Ability to bridge traditional data platforms with modern AI capabilities