Alpaca is a US-headquartered self-clearing broker-dealer and a provider of brokerage infrastructure for financial services. The Operations Reliability Engineer will work closely with brokerage operations to replace manual processes with durable software systems, making operations more efficient and reliable.
Responsibilities:
- Design, build, test, deploy, and monitor production automations and UIs that remove manual steps and reduce operation time
- Partner with frontend engineers to productize ops tooling so global teams can run functions with predictable staffing
- Execute operational procedures to surface painful manual processes prior to automation
- Instrument and report baseline and outcome metrics (MTTC, manual steps removed, queue sizes, ops satisfaction) and iterate based on measured impact
- Produce Platform Opportunity Briefs / RFCs for higher-level platform tooling and automations
- Collaborate with licensed BD leadership, Compliance, and Security to build auditable, safe automations with role-based access and clear runbooks
- Own the full lifecycle of the systems you build, including automated deployment (CI/CD with tools like ArgoCD and Terraform), proactive monitoring, on-call support rotations, and incident response, following a "you build it, you run it" philosophy
- Build systems with auditability, traceability, and data lineage as first-class concerns to ensure transparency for our auditors and regulators
Requirements:
- 5+ years of professional software engineering experience, with a proven track record of shipping and operating complex, large-scale systems in production
- Strong business sense and understanding of operations
- Deep, hands-on expertise in Golang, including a strong command of its concurrency models (goroutines, channels), memory management, and standard library
- Proven track record of building user-facing features end-to-end with TypeScript/React
- Proficient with SQL and relational databases, preferably PostgreSQL
- Demonstrated ability to reason about human workflows as systems, not just software services
- Experience with observability, tracing, and continuous profiling
- Exceptional analytical and problem-solving skills, including the ability to deconstruct complex requirements into clear technical components, and excellent communication skills for working in a cross-functional environment
- High-ownership mindset with a bias toward durable, structural fixes over tactical patches
- Knowledge of service-oriented architectures
- Experience with major cloud platforms (we primarily use GCP)
- Knowledge of financial markets (exchanges, broker-dealers, clearing, etc.)
- Experience with Docker and Kubernetes
- A passion for financial markets or the desire to learn
- Knowledge of Agile/Scrum methodologies
- Demonstrable experience in designing, building, and reasoning about distributed systems, including a strong understanding of microservices architecture and API design patterns (e.g., REST, gRPC)
- Experience with capacity planning and benchmarking