Mistral AI is a pioneering company focused on leveraging AI to enhance productivity and creativity. The Model Behavior Architect defines and measures how large language models interact with tools and functions. The role requires expertise in model evaluation and in building evaluation pipelines.
Responsibilities:
- Interact with models to identify where function calling and tool use behaviour can be improved
- Gather internal and external feedback on tool-calling behaviour to scope areas for improvement
- Design and implement evals, data guidelines, data generation, and synthetic tool environments and APIs
- Identify and fix edge-case behaviours, such as malformed arguments, hallucinated functions, and incorrect tool selection, through rigorous testing
- Develop robust evaluation pipelines for the function-calling capabilities of our model candidates
- Work collaboratively with AI Scientists
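
As an illustration of the edge cases named in the responsibilities above, here is a minimal sketch of a check that labels a single model-emitted tool call. It is not Mistral's evaluation code: the tool registry `TOOLS`, the function `classify_tool_call`, the label strings, and the order in which the checks run are all hypothetical choices made for this example.

```python
import json

# Hypothetical tool registry (an assumption for this sketch):
# tool name -> {argument name: accepted Python type(s)}.
TOOLS = {
    "get_weather": {"city": str},
    "convert_currency": {"amount": (int, float), "source": str, "target": str},
}


def classify_tool_call(name, raw_arguments, expected_name):
    """Label one tool call; raw_arguments is the JSON string the model emitted."""
    # The model named a function that does not exist in the registry.
    if name not in TOOLS:
        return "hallucinated_function"

    # The arguments must parse as a JSON object.
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return "malformed_arguments"

    schema = TOOLS[name]
    # The argument names must match the schema exactly.
    if not isinstance(args, dict) or set(args) != set(schema):
        return "malformed_arguments"

    for key, accepted_types in schema.items():
        value = args[key]
        # bool is a subclass of int in Python, so reject it explicitly;
        # no tool in this hypothetical registry takes a boolean.
        if isinstance(value, bool) or not isinstance(value, accepted_types):
            return "malformed_arguments"

    # The call is well formed but targets a different tool than the reference.
    if name != expected_name:
        return "incorrect_tool_selection"

    return "ok"


if __name__ == "__main__":
    print(classify_tool_call("get_weather", '{"city": "Paris"}', "get_weather"))
    print(classify_tool_call("get_forecast", '{"city": "Paris"}', "get_weather"))
```

An evaluation pipeline could run a check of this kind over a set of prompts with reference tool choices and report the rate of each label per model candidate.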