Hark is an artificial intelligence company focused on building advanced, personalized intelligence systems. They are seeking a Member of Technical Staff, Infrastructure Speech to lead and scale the infrastructure for their real-time speech-to-speech engine, ensuring reliability and performance in low-latency environments.
Responsibilities:
- Enable repeatable, auditable, and scalable provisioning of the speech inference stack
- Harden CI/CD pipelines to ensure secure deployment of ultra-low-latency, real-time speech services across all production environments
- Lead the evolution of the end-to-end infrastructure powering Hark's speech-to-speech models, including streaming pipelines, session management, and fault tolerance
- Collaborate with speech ML researchers to identify latency bottlenecks and translate complex requirements into robust infrastructure enhancements
- Oversee system health and incident response, defining critical SLOs for real-time speech workloads where performance and uptime are paramount
- Manage capacity planning, cost efficiency, and the hardware lifecycle for the global speech inference fleet
- Build internal tooling and platform abstractions to streamline the developer experience for teams operating on speech infrastructure