Senior AI Engineer - Generative AI & Data Platform (AWS)

Hybrid

2-3 days per week onsite at the client's Irvine, CA office
1 day per week onsite at the client's Downtown Los Angeles office
1 day remote

Position Overview

We are seeking a highly skilled Senior AI Engineer to lead the design, development, and operationalization of a production-grade Generative AI and Data Platform on AWS. This role will be responsible for building scalable AI solutions that leverage Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), vector search, knowledge graphs, and governed data pipelines.

The ideal candidate will have deep expertise across the complete AI lifecycle, including data ingestion, knowledge engineering, embeddings generation, retrieval systems, backend API development, MLOps, and production deployment. This individual will work closely with product, engineering, and platform teams to enable AI-powered capabilities in customer-facing applications while helping evolve the organization toward agentic AI architectures.

Key Responsibilities

1. Generative AI Platform Development & Integration

Design, build, and operationalize LLM-powered applications using:
- Retrieval-Augmented Generation (RAG)
- Embedding pipelines
- Prompt orchestration frameworks
- Evaluation and experimentation frameworks
Develop and optimize vector search solutions using Amazon OpenSearch.
Design and implement graph-based knowledge systems using Amazon Neptune to support:
- Relationship modeling
- Knowledge lineage
- Explainability
- Knowledge discovery
Integrate supporting AWS services including:
- Amazon ElastiCache (Redis) for caching and session management
- Amazon DynamoDB for low-latency, scalable data access
Build agentic AI workflows using frameworks such as:
- LangGraph
- AutoGen
- CrewAI
- Equivalent agent orchestration frameworks
Implement LLM application frameworks including:
- LangChain
- LlamaIndex
Establish standards for:
- Tool integration
- Context management
- Shared memory patterns
- MCP-style architectures and context-sharing mechanisms
Evaluate and optimize:
- Model performance
- Retrieval effectiveness
- Latency
- Cost efficiency
- Context window utilization

2. Data Engineering & Knowledge Management

Design and develop scalable data pipelines using Databricks and Apache Spark.
Build and maintain:
- Data ingestion pipelines
- Data transformation workflows
- Document processing pipelines
- Metadata enrichment processes
- Embedding generation and indexing workflows
Implement document preparation techniques including:
- Chunking strategies
- Metadata tagging
- Semantic enrichment
Ensure high standards of data quality through:
- Validation frameworks
- Completeness checks
- Consistency monitoring
- Data observability
Implement data governance controls including:
- Data classification
- Access management
- Retention policies
- Auditability
- Lineage tracking

3. Backend Services & API Engineering

Design and develop scalable backend services exposing AI platform capabilities.
Build secure, reusable APIs and microservices for enterprise applications.
Establish best practices for:
- API design
- Versioning
- Reliability
- Retry mechanisms
- Circuit breakers
- Idempotent operations
Enable platform reusability across multiple teams and business applications.

4. MLOps, Deployment & Operational Excellence

Design and maintain CI/CD pipelines for AI, ML, and data workloads.
Deploy and manage production systems using:
- Docker
- Kubernetes
Implement deployment strategies including:
- Blue-Green Deployments
- Canary Releases
- Rollback Mechanisms
- Feature Flagging
Ensure platform reliability through:
- Monitoring
- Logging
- Alerting
- Observability
- Cost tracking
- Data freshness monitoring
Implement:
- Secrets management
- Role-based access controls
- Least-privilege security practices
Continuously optimize platform performance, scalability, and cost.

5. LLM Evaluation, Observability & Quality Engineering

Define and measure AI quality metrics including:
- Grounding/Faithfulness
- Retrieval relevance
- Response consistency
- Hallucination rates
- Latency
- Cost per request
Build and maintain:
- Prompt versioning frameworks
- Offline evaluation pipelines
- Automated testing processes
- Continuous improvement workflows
Drive AI quality improvements through experimentation and monitoring.

6. AI Security, Governance & Compliance

Implement secure AI solutions with:
- Authentication
- Authorization
- Access controls
- Data protection mechanisms
Establish responsible AI guardrails.
Ensure compliance with organizational and industry standards related to:
- AI safety
- Privacy
- Governance
- Monitoring
- Auditability

Required Qualifications

Education

Bachelor's or Master's degree in:
- Computer Science
- Data Science
- Artificial Intelligence
- Machine Learning
- Related technical discipline

Required Technical Skills

Generative AI & LLMs

Strong hands-on experience building production-grade Generative AI solutions.
Expertise in:
- Retrieval-Augmented Generation (RAG)
- Embeddings
- Prompt engineering
- Retrieval optimization

AWS Cloud

Hands-on expertise with:
- Amazon OpenSearch (Vector Search)
- Amazon Neptune
- Amazon DynamoDB
- Amazon ElastiCache (Redis)

LLM Frameworks

Experience with:
- LangChain
- LlamaIndex

Agentic AI Frameworks

Hands-on experience with:
- LangGraph
- AutoGen
- CrewAI
- Similar agent orchestration frameworks

Data Engineering

Strong experience with:
- Databricks
- Apache Spark
- Large-scale data pipelines
- Embedding pipelines

Backend Engineering

Strong Python development experience.
Experience building scalable APIs and microservices.
Strong understanding of distributed systems and service-oriented architectures.

Platform Engineering

Experience with:
- CI/CD pipelines
- Docker
- Kubernetes
- Production AI deployments

Preferred Qualifications

Experience with AI evaluation and observability platforms.
Experience implementing AI governance and compliance frameworks.
Advanced Kubernetes and MLOps experience.
Familiarity with:
- Model Context Protocol (MCP)
- Agent-based architectures
- Multi-agent systems
- Knowledge graph ecosystems

Domain Experience

Preferred experience in one or more of the following:
- AI/ML Platform Engineering
- Generative AI Applications
- Enterprise AI Platforms
- Data Platforms & Big Data Engineering
- Knowledge Management Systems

Certifications (Preferred)

One or more AWS certifications:
- AWS Certified Solutions Architect
- AWS Certified Machine Learning - Specialty
- AWS Certified Data Engineer

Soft Skills

Strong analytical and problem-solving abilities.
Excellent communication and stakeholder management skills.
Ability to explain complex AI concepts to technical and non-technical audiences.
Collaborative and cross-functional mindset.
Strong ownership mentality with proactive execution.
Ability to thrive in fast-paced, evolving environments.

Mandatory Skills Checklist

Candidates must demonstrate hands-on production experience in:
- Generative AI / LLMs (RAG, Embeddings, Prompt Engineering)
- AWS Cloud Services (OpenSearch, Neptune, DynamoDB, Redis/ElastiCache)
- Vector Search & Retrieval Systems
- Knowledge Graphs / Graph Databases (Amazon Neptune)
- LangChain and/or LlamaIndex
- Agentic AI Frameworks (LangGraph, AutoGen, CrewAI)
- Databricks & Apache Spark
- Python Backend Development & API Engineering
- Production Deployment using Docker and Kubernetes
- AI Platform Architecture and End-to-End Delivery
