Role Overview

Define AI feature specifications upfront — including acceptance criteria, evaluation metrics, prompt contracts, and expected behaviors — and champion this spec-driven approach across the team.
Own end-to-end AI feature delivery across the full AI SDLC: spec definition, prototyping, development, evaluation, deployment, and production monitoring.
Build production-grade LLM and agentic AI applications using Spring AI — including RAG pipelines, agent orchestration, tool-use patterns, guardrails, and human-in-the-loop workflows.
Architect and operate AWS AI infrastructure (Bedrock, Bedrock Agents, AgentCore, SageMaker) alongside core AWS services (ECS/EKS, Lambda, S3, DynamoDB, RDS, API Gateway).
Design and implement scalable microservices and distributed systems in Java, TypeScript, and Python that power the Archer AI platform.
Build CI/CD pipelines for AI workloads — including LLM evaluation pipelines and automated regression testing for AI outputs — using Terraform, CloudFormation, Docker, Kubernetes, and GitHub Actions.
Drive AI-specific operational practices: observability, drift detection, quality scoring, feedback loops, and incident response for non-deterministic systems.
Communicate technical concepts clearly to both technical and non-technical stakeholders; author AI specs, design documents, and architectural decision records.
Mentor engineers, conduct thorough code reviews, and champion engineering excellence.

Requirements

Spec-Driven AI SDLC: Deep expertise in the AI software development lifecycle with a specification-first mindset.
Experience authoring AI feature specs (acceptance criteria, evaluation metrics, prompt contracts) and driving the full lifecycle from prototyping through evaluation frameworks, A/B testing, deployment of non-deterministic systems, and production monitoring (drift detection, quality scoring, feedback loops).
Track record of shipping AI-powered features through multiple product cycles with engineering rigor.
AWS AI Infrastructure: Strong hands-on experience with Amazon Bedrock, Bedrock Agents, AgentCore, SageMaker, and Amazon Q.
Solid knowledge of core AWS infrastructure including compute (ECS/EKS, Lambda), databases (RDS, DynamoDB, ElastiCache), networking (VPC, ALB, CloudFront), and security (IAM, KMS, Secrets Manager).
Experience architecting AI infrastructure pipelines with cost optimization and high availability.
LLM Frameworks & Agentic AI: Hands-on experience building production applications with Spring AI.
Solid understanding of LLM application patterns (prompt management, RAG, context orchestration, vector stores, evaluation) and agentic workflows (multi-step agents, tool-use orchestration, planning loops).
Java, TypeScript & Python: 5+ years of professional software engineering with strong proficiency across all three languages — Java (Spring Boot, Spring Cloud), TypeScript (Node.js, modern frameworks), and Python (AI tooling, evaluation frameworks).
Comfortable choosing the right language for each task.
Enterprise & Large-Scale Systems: Experience designing and operating distributed systems at scale.
Familiarity with event-driven architectures, message brokers (Kafka, SQS/SNS), caching (Redis, ElastiCache), and relational/NoSQL database design.
DevOps & Infrastructure: Proficiency in CI/CD pipelines, Infrastructure as Code (Terraform, CloudFormation), containerization (Docker, Kubernetes/EKS), and GitOps workflows.
Problem Solving & Communication: Excellent analytical skills and the ability to tackle complex, ambiguous challenges independently. Outstanding written and verbal communication — able to articulate technical concepts to diverse audiences and collaborate effectively across teams.
Education: Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field (or equivalent practical experience).

Tech Stack

AWS
Cloud
Distributed Systems
Docker
DynamoDB
Java
JavaScript
Kafka
Kubernetes
Microservices
Node.js
NoSQL
Python
Redis
SDLC
Spring
Spring Boot
Terraform
TypeScript

Benefits

Competitive Medical, Dental, and Vision insurance
401(k) matching program
Flexible Time Off
11 paid holidays
1 volunteer day per year

Senior AI Platform Engineer
