Design, build, and maintain observability pipelines using the Elastic Stack (Elasticsearch, Kibana, Fleet) across Azure and AWS workloads
Develop and own SLO/SLI dashboards and error budget reporting for BaaS platform services
Respond to incidents and lead incident response for distributed, multi-tenant cloud workloads; own runbook creation, maintenance, and continuous improvement
Build and refine proactive support tooling, including pattern analysis, tenant correlation dashboards, and baseline deviation alerting, to reduce reactive support burden
Manage and maintain Elastic Fleet agent policies, enrollment health, and log streaming pipelines across Azure and AWS worker fleets
Partner with SRE, R&D, and Proactive Support teams to close observability gaps, including tenant identification workflows and admin portal integrations
Requirements
5+ years of experience in cloud platform engineering, SRE, or infrastructure roles supporting commercial SaaS products
Deep hands-on experience with the Elastic Stack: building dashboards, writing KQL/Query DSL, and managing Fleet
Proven experience operating and troubleshooting distributed, multi-tenant workloads on Azure and/or AWS
Strong understanding of Azure cloud services: AKS, Entra ID, Key Vault, Service Bus, Cosmos DB, Private Endpoints, etc.
Experience with incident response in production cloud environments, including runbook development and post-incident review
Experience with IaC tools (Azure Bicep, Terraform) and CI/CD pipelines (Azure DevOps, GitHub Actions)
Strong scripting skills in Bash, Python, or PowerShell
Ability to work cross-functionally with SRE, product, and customer-facing support teams
Tech Stack
AWS
Azure
Cloud
Elasticsearch
Python
Terraform
Vault
Benefits
18 paid vacation days, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
Private medical coverage for you and up to four dependents
Life, accident, and disability insurance with enhanced coverage
Annual flexible wellbeing allowance for physical and mental wellness
Free, confidential counselling and coaching via the Employee Assistance Program (EAP), including legal and financial advice
Meal, fuel, and transportation benefits based on work arrangement
Daycare reimbursement and safe cab facility for eligible employees
Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning