Amtrak is a leading transportation company that connects businesses and communities across the United States. The company is seeking a Principal DevOps Engineer to ensure the resilience, scalability, and security of its digital platforms, drive architectural decisions, and mentor engineers.
Responsibilities:
- Architect DevSecOps CI/CD pipelines with progressive delivery (canary, blue-green, feature flags)
- Automate rollback/fail-forward and release evidence capture
- Standardize quality gates (tests, pre-production performance and chaos testing)
- Publish hardened base images and golden IaC modules with guardrails
- Enforce Kubernetes RBAC, network policies, resource quotas, and secrets standards
- Design multi-env promotion workflows with policy checks
- Establish SLOs/error budgets; drive cross-team reliability improvements
- Bake runbooks into alerts; add synthetic/load tests to pipelines
- Lead major incidents; land systemic fixes (not just patches)
- Enforce short-lived creds, zero-trust patterns, and attestation/signing
- Automate compliance checks and evidence collection
- Partner with security on threat-modeling for platform changes
- Create internal libraries/CLIs with telemetry and docs
- Measure automation ROI (time saved, error-rate drop)
- Orchestrate complex workflows (e.g., Step Functions/Argo Workflows)
- Own a platform capability end-to-end (roadmap, SLAs, upgrades)
- Drive adoption of best practices across multiple teams
- Write ADRs and decision logs that clarify trade-offs
- Define/validate RPO/RTO; automate restore drills and reports
- Tune critical paths for latency/throughput and cost
- Forecast impacts of migrations; deliver measurable cost/perf wins
Requirements:
- Bachelor's degree in Computer Science, Engineering, or related technical discipline
- At least 5 years of experience in DevOps, SRE, or Platform Engineering roles with leadership experience in automation and infrastructure reliability
- 3+ years of hands-on experience in high-availability production environments with cloud-native security and observability tooling
Preferred Qualifications:
- Master's degree in Computer Science or equivalent
- Certifications: AWS DevOps Engineer Pro, Terraform Associate, CKA, or SRE-focused credentials
- Experience with developer portals (e.g., Backstage), service mesh (e.g., Istio), and security tooling (e.g., Vault, Falco, Trivy)
- Knowledge of DORA metrics, reliability KPIs, and engineering effectiveness measurement frameworks
- Background in regulated environments (e.g., PCI, HIPAA, FedRAMP) with experience implementing security automation at scale