ClickUp is building the future of work with an AI-native workspace that unifies various tools and technologies. They are seeking a GTM DevOps Engineer to manage the reliability and automation of their Go-To-Market technology stack, ensuring smooth deployment and operation of critical business systems.
Responsibilities:
- Design, build, and maintain CI/CD pipelines for Salesforce (SFDX/Salesforce CLI), NetSuite (SuiteScript/SuiteBundler), MuleSoft (Anypoint Platform), and Workato; establish branching strategies, environment promotion standards, and release gating processes across all GTM platforms
- Extend CI/CD practices to cover AI agent workloads deployed on GCP Cloud Run and AWS Bedrock AgentCore — including containerized builds, deployment pipelines, and automated validation gates
- Implement safe rollout patterns — including feature toggles, phased launches, automated validation, smoke tests, and rollback procedures — to reduce deployment risk on business-critical changes
- Own SLA/SLO definitions for core GTM systems; standardize monitoring, alerting, and runbook patterns across quote-to-cash and GTM integrations, with proactive health checks and synthetic monitoring for critical flows (e.g., Salesforce ↔ NetSuite, Workato)
- Extend observability coverage to GCP Cloud Run workloads — Cloud Scheduler jobs, agent pipelines, and integration microservices — and AWS-hosted agent infrastructure
- Conduct root cause analysis (RCA) for platform incidents and drive post-incident reviews with actionable remediation plans
- Manage sandbox, staging, and production environment lifecycles across GTM platforms — including refresh cycles, data masking, environment segmentation, and promotion standards that balance speed with reliability
- Own cloud infrastructure for Business Systems-operated workloads on GCP (Cloud Run, Cloud Scheduler, Secret Manager, GCS, Artifact Registry) and AWS (Lambda, S3, EventBridge, Secrets Manager, Bedrock AgentCore); apply IaC practices to make provisioning repeatable and auditable
- Establish base image pinning, dependency vulnerability scanning, and supply chain security practices for containerized workloads — particularly AI-generated codebases deployed via tools like Cursor or Claude Code
- Define and enforce patch management and container runtime ownership for vibe-coded and agentic workloads entering production
- Establish and enforce a consistent secrets management standard across all Business Systems workloads — GCP Secret Manager, AWS Secrets Manager, and equivalent — eliminating credential exposure via environment variables, source code, or client-side contexts
- Define and maintain API key rotation policies in alignment with security standards (high-severity keys: quarterly; vendor keys: annually at minimum)
- Partner with Security and IT on IAM scoping, least-privilege service accounts, VPC configuration, and public/private endpoint governance for Cloud Run and Bedrock deployments
- Maintain a centralized registry of deployed workloads — GitHub repos, deployment URLs, architecture docs, data classification, and observability dashboard links — accessible to AppSec and infrastructure teams
- Build internal tooling, automation scripts, and automated testing frameworks (unit, integration, regression) to reduce toil and increase deployment confidence; continuously evaluate new tooling to improve developer experience
- Develop or enforce GitHub repository templates for Cloud Run deployments that cover security audits, deployment configuration, API integration, and MCP server patterns — serving as a reusable foundation for AI-assisted builds
- Define where self-service deployment and administration are appropriate versus where stronger change control and operational guardrails are required; serve as the DevOps SME, enabling developers to operate with autonomy within those boundaries
- Document and maintain operational runbooks, architecture decision records (ADRs), and deployment standards as living artifacts
- Collaborate with IT, Data Engineering, Security, and business stakeholders on cross-functional initiatives that touch the GTM platform
Requirements:
- 4+ years in a DevOps, Site Reliability Engineering (SRE), or Platform Engineering role
- 2+ years of hands-on experience deploying and managing Salesforce environments (SFDX, Salesforce CLI, scratch orgs, sandboxes, change sets, or pipeline-based deployments)
- Experience with at least one additional GTM/ERP platform, such as NetSuite, MuleSoft, or Workato, in an operational or deployment capacity
- Demonstrated experience building and maintaining CI/CD pipelines using tools such as GitHub Actions, GitLab CI, Jenkins, or Copado
- Hands-on experience with GCP services (Cloud Run, Cloud Scheduler, Secret Manager, Artifact Registry, GCS) in a deployment or operations context
- Experience with AWS services in an integration or operations context — Lambda, S3, SQS, Secrets Manager, CloudWatch; familiarity with Bedrock AgentCore or similar managed agent hosting infrastructure is a strong plus
- Experience managing containerized workloads, including image lifecycle, dependency scanning, and supply chain security practices
- Experience with incident management, on-call processes, and writing post-incident reviews
- Strong scripting skills in Python, Bash, or Node.js for automation and tooling
- Proficiency with Git, branching strategies, and version control best practices
- Understanding of API-based integration patterns (REST, SOAP, event-driven) as they apply to MuleSoft, Workato, or similar iPaaS tools
- Hands-on familiarity with secrets management across GCP Secret Manager and AWS Secrets Manager; strong grasp of security best practices in CI/CD contexts (least-privilege IAM, no hardcoded credentials, key rotation)
- Monitoring and observability tooling experience (Datadog, Splunk, Google Cloud Monitoring, CloudWatch, or similar)
- Infrastructure-as-code proficiency — Terraform or CloudFormation; experience applying IaC to Cloud Run or Lambda-based workloads preferred
- Familiarity with container security practices: base image pinning, vulnerability scanning (e.g., Artifact Registry image scanning, Trivy), and dependency management for AI-generated codebases
- Ownership: Takes full accountability for platform uptime, deployment quality, and the operational health of GTM infrastructure end-to-end
- Velocity: Operates with a bias for action — ships iteratively, reduces toil systematically, and keeps release cycles fast and predictable
- AI & Automation mindset: Actively leverages AI-assisted tooling and automation to accelerate delivery, improve reliability, and reduce manual overhead across the GTM stack
- Collaboration: Works effectively as a cross-functional partner to developers, architects, Security, IT, and business stakeholders; enables autonomy without becoming a bottleneck
- Growth: Continuously evaluates new tooling, patterns, and practices to raise the bar on platform engineering and developer experience