Equinix is the world’s digital infrastructure company, operating 280+ data centers across the globe and providing interconnections to all the key clouds and networks. They are looking for an experienced Senior Staff Engineer to help build and operate a highly scalable, available, and information-rich unified network model.
Responsibilities:
- Actively participate in the design, development, test, and operation of highly reliable services and software to model network entities and relationships that compose Equinix’s global network
- Provide technical guidance and collaborate with stakeholders to identify network data and use cases that can enrich the unified network model to solve critical business problems that increase network reliability, visibility, awareness, and the ability to plan
- Develop solutions that leverage data from the unified network model to provide insights that enhance product capabilities for customers, help operations teams troubleshoot customer escalations and plan maintenance, and inform capacity planning teams
- Facilitate cross-stakeholder discussions to ensure alignment on software requirements and design trade-offs, while considering performance, scalability, and reliability factors
- Take a proactive and collaborative approach to working with cross-functional teams, ensuring seamless integration of the unified network model into Equinix Brain alongside other domains and Network-as-a-Service (NaaS) initiatives
- Lead by example through direct contribution, and provide direction in establishing development and operational best practices and standard methodologies
- Participate in an on-call rotation
Requirements:
- 7+ years of experience developing distributed, scalable, highly available software services using Golang
- Background working at SaaS, PaaS, IaaS, or cloud-based companies, with prior experience designing microservices and systems at scale with a focus on production readiness
- Experience building software as a service and running services with 24x7 on-call rotations
- Proficient in data management systems and technologies: GCP Spanner, MongoDB, Redis, Neo4j
- Experience with containerization and orchestration technologies: Docker, Kubernetes, or other open-source alternatives
- Experience working with network management protocols: gRPC, NETCONF
- Solid understanding of networking concepts, protocols (e.g. IS-IS, BGP, BMP, LLDP), and their applications
- Strong experience building and operating highly reliable distributed systems
- Proficient in using continuous integration and continuous deployment technologies: GitHub Actions, Argo CD
- Experience with public cloud (AWS, GCP, Azure) services and technologies
- Hands-on experience with an observability stack (metrics, logs, traces) such as Grafana, Prometheus, and Thanos
- Experience with agile software development practices and tools, including Jira, peer reviews, Git, and CI/CD
- Excellent problem-solving and analytical skills to troubleshoot and resolve distributed system issues
- Strong written and verbal communication skills to effectively convey findings, recommendations, and technical details to various stakeholders
- Bachelor's degree in computer science or related technical field
Preferred qualifications:
- Master's degree or PhD in Computer Science or a related technical field
- Excellent coding skills in Golang
- Prior experience building a network model or digital twin
- Prior experience using AI to make real-time decisions (on the network)