Summary

Develop analytics and AI solutions to transform raw data into meaningful insights using statistics, machine learning, and visualization software, with a strong focus on LLMs and generative AI in a public-sector context.

Responsibilities

Collect, process, and analyze structured and unstructured data using data mining, modeling, NLP, and ML techniques.
Develop predictive models and algorithms and design automated data pipelines and workflows.
Build dashboards, reports, and visualizations; collaborate with multiple teams to refine requirements.
Implement AI governance and safety guardrails to reduce hallucinations, bias, and security risks (e.g., prompt injection).
Develop LLM evaluation benchmarks with automated metrics and human-in-the-loop feedback.
Identify opportunities for parameter-efficient fine-tuning (PEFT, e.g., LoRA) on state government datasets.

Requirements

Minimum 3 years of data science experience.
Strong background in statistical analysis and ML; proficiency in SQL, Python, R, or similar.
Experience with ML libraries/frameworks and methods such as regression, clustering, and classification.
2+ years of hands-on experience with LLMs such as GPT-4, Claude, Llama, Gemini, or similar, and their APIs.
Expert-level proficiency with LLM orchestration frameworks such as LangChain, LlamaIndex, or Haystack.
Experience with vector databases (Pinecone, Weaviate, Milvus, pgvector) and with synthetic or instruction-tuning datasets.
Preferred: Experience in regulated/public-sector environments with PII/PHI and ethical AI standards.

Education

Master’s or PhD in computer science, statistics, mathematics, economics, or a related field.
Three years of equivalent related experience may substitute for a Bachelor’s degree.

Applied AI Data Scientist / Prompt Engineer
