Design scalable and robust AI infrastructure that supports model training, deployment, and management
Update and maintain AI models and infrastructure as technology and requirements evolve
Stay updated with the latest advancements in AI and machine learning technologies
Experiment with new tools and techniques to improve the cloud data lake platform and models
Gain a deep understanding of data by collaborating with business and advanced analytics partners; apply that understanding to design, curate, and publish connected data sets that enable users to self-serve
Lead efforts to ensure our Azure applications in production run within SLA and remain robust and secure
Propose technology improvements or come up with innovative solutions to solve business problems.
Participate in AI projects and propose the best solutions for them
Collaborate with architecture, security, and risk teams to implement the latest guidelines and Azure standard methodologies
Lead, mentor, and guide data engineers to promote a customer-first approach
Automate infrastructure provisioning and deployment using tools such as Terraform
Implement CI/CD pipelines that enable automated code deployment
Participate in Agile sprints and ceremonies; support rapid iteration and development
Take the lead in maintaining the inventory of critical data elements and lineage
Cultivate and maintain strong relationships and foster collaboration with various teams and partners within the organization
Provide guidance and mentorship to junior engineers and other team members.
Requirements
Bachelor’s degree in a computer, IT, or data-related field required
Minimum of 10 years of IT industry experience required
7 to 10 years of platform engineering experience, including work enabling Azure-related data technologies
1 to 2 years of experience with model and infrastructure deployments for AI
Solid understanding of Azure infrastructure: subscriptions, resource groups, resources, role-based access control (RBAC), integrations with Azure AD and Azure security principals (user group, service principal, managed identity), network concepts (VNet, subnet, NSG rules, private endpoints), password/credential/key management, and data protection
Strong hands-on knowledge of Azure Databricks, ADF, ADLS, Synapse serverless, dedicated, and Spark pools, Python, PySpark, and T-SQL, along with experience designing and developing scripts for ETL processes and automation in Azure Data Factory and Azure Databricks
High proficiency in Git, Jenkins, and DevOps processes to maintain and resolve issues with data pipelines in production
Good understanding of data modeling, data marts, data lakehouse architecture, SCD, data mesh, and Delta Lake
Solid understanding of data privacy and compliance regulations and standard methodologies for preserving customer data
Experience deploying LLMs and monitoring AI infrastructure in cloud environments
Knowledge of implementing Azure technologies and networking via Terraform, along with the ability to fix issues with Azure infrastructure in production
Tech Stack
Azure
Cloud
ETL
Jenkins
PySpark
Python
Spark
SQL
Terraform
Benefits
health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage
adoption/surrogacy and wellness benefits
employee/family assistance plans
retirement savings plans (including pension and a global share ownership plan with employer matching contributions)