Baseten powers mission-critical inference for leading AI firms, enabling them to bring advanced models into production. The Applied AI Inference Engineer will collaborate with customers to design, build, and deploy high-scale production AI applications, translating business goals into reliable services while ensuring quality and performance outcomes.
Responsibilities:
- Develop and maintain software systems and product features in a production environment using one or more general-purpose programming languages; Python is preferred for its relevance to ML projects
- Drive customer impact by designing, implementing, and deploying Baseten solutions end-to-end (problem framing → evaluation → production deployment → monitoring). This involves working with customers’ engineering teams at every stage of the customer journey, including sales, implementation, and expansion
- Deliver with velocity: turn vague objectives into clear specs and well-defined PoCs so we can rapidly ship well-tested services and outcomes for our customers
- Optimize and enhance AI/ML projects, contributing to the continuous improvement of our technical stack. This includes developing features and PRDs with other engineering and product orgs
- Own products and customer projects end-to-end, functioning as engineer, project manager, and product manager at once, with a focus on user empathy, project specification, and execution
- Navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems, avoiding unnecessary complexity
- Demonstrate pride, ownership, and accountability for your work, expecting the same from your teammates