Valiant Harbor International is seeking a Software Development Engineer III to support the Director’s Office at the Advanced Research Projects Agency for Health. The role involves building agentic AI systems and facilitating LLM application development, while managing features and ensuring reliability in collaboration with internal and external partners.

Responsibilities:

Design and build agentic AI systems and orchestration:
Design and build GRACE's core agentic workflows (e.g., multi-step reasoning, planning, memory, and tool-use across single and multi-agent systems)
Implement and evolve A2A communication patterns at the application layer, enabling GRACE agents to collaborate and hand off tasks
Build and maintain the tool-calling layer (tool definitions, input/output schemas, error handling, retry logic, and result formatting)
Manage the MCP client-side integration
Design multi-agent workflows that are reliable, observable, and debuggable in production
Facilitate LLM application development:
Own LLM orchestration at the application layer (prompt construction, context management, model selection logic, and response parsing)
Build and maintain RAG features (query formulation, result ranking, citation grounding, and hallucination mitigation)
Implement and iterate on prompt engineering patterns and system prompts across OpenAI GPT, Anthropic Claude, and Google Gemini
Manage context window budgets (truncate, summarize, paginate, etc.) and build the logic that makes those decisions correctly
Build evaluation pipelines for LLM quality (grounding assessment, regression testing, safety checks, and A/B experimentation on prompt and model changes)
Manage prompts and pipelines that are cost-efficient without sacrificing output quality
Manage features and products:
Translate ambiguous product requirements into clear technical designs that can ship quickly
Build new GRACE capabilities end-to-end (from backend application logic through to the API contract with the frontend)
Rapidly prototype new agentic features, run experiments, collect data, and iterate based on real user behavior
Perform oversight and quality assessments; write tests, handle edge cases, and make sure your features degrade gracefully when upstream dependencies fail
Manage reliability and collaboration with internal/external partners:
Instrument agentic workflows with tracing, logging, and metrics so failures are diagnosable and regressions are caught before users report them
Define and monitor application-level SLOs: tool call success rates, response quality, and latency from the user's perspective
Build fallback and guardrail logic for AI services
Write production-quality code: readable, tested, reviewed, and documented
Work closely with the infra engineer to understand system-level constraints and design application behavior that respects them
Participate actively in design reviews, mentor other engineers, and communicate technical decisions clearly to both engineers and non-engineers
