About this role
Role Overview
AWS Infrastructure Engineering
Design, deploy, and manage AWS infrastructure supporting production financial workloads.
Work with services including EC2, ECS/Fargate, RDS, Aurora, S3, VPC, Lambda, CloudWatch, Secrets Manager, IAM, DynamoDB, SQS, SNS, EventBridge, API Gateway, AWS IAM Identity Center, and QuickSight.
Implement and maintain Infrastructure as Code using Terraform, CloudFormation, or AWS CDK.
Manage infrastructure across multiple AWS accounts and environments.
Build and optimise CI/CD pipelines using tools such as GitLab CI, GitHub Actions, AWS CodePipeline, or equivalent.
Design and maintain secure networking, including VPCs, subnets, routing, security groups, NACLs, Site-to-Site VPN, Transit Gateway, and related controls.
Implement monitoring, alerting, logging, and observability using CloudWatch, dashboards, metrics, alarms, and log aggregation.
Execute cost optimisation initiatives across compute, storage, databases, data transfer, and managed services.
Build and maintain Amazon QuickSight dashboards for billing analytics, cost allocation, usage trends, and financial reporting.
Support recurring FinOps activities including monthly spend reviews, rightsizing, Savings Plans / Reserved Instance analysis, and waste elimination.
Database Administration
Administer PostgreSQL environments, including RDS for PostgreSQL, Aurora PostgreSQL, and self-managed PostgreSQL where applicable.
Support PostgreSQL version lifecycle management, including tracking which versions are supported in production and planning upgrades.
Manage replication, backup and recovery, point-in-time recovery, vacuuming, indexing, query tuning, and performance troubleshooting.
Manage Amazon Aurora PostgreSQL clusters, including scaling, failover, parameter groups, monitoring, and Performance Insights.
Administer SQL Server on Amazon RDS, including backup strategies, index maintenance, Query Store analysis, and parameter tuning.
Plan and execute database migrations, including SQL Server to PostgreSQL migrations using AWS DMS and native database tooling.
Implement database security controls, including encryption at rest and in transit, IAM authentication where appropriate, audit logging, access control, and secrets management.
Linux Systems Administration
Manage Amazon Linux 2023 and RHEL-based systems.
Perform patching, hardening, performance tuning, log management, and operational troubleshooting.
Administer Apache HTTP Server, including virtual hosts, SSL/TLS, module configuration, and runtime troubleshooting.
Write and maintain Bash scripts for automation, monitoring, deployment, and operational tasks.
Implement host-level security controls including SSH hardening, firewall rules, least-privilege access, log forwarding, and vulnerability remediation.
AI-Assisted Engineering and Automation
Use approved AI-assisted engineering tools to improve operational workflows, documentation, code review, and troubleshooting.
Work with Amazon Bedrock and supported large language models, including Anthropic Claude models where approved for company use.
Use Kiro, AWS's agentic coding service, where appropriate to support spec-driven development, documentation, testing, and implementation planning.
Evaluate AI-assisted tooling for infrastructure operations, anomaly detection, documentation generation, and incident response support.
Build automation that improves alert enrichment, operational insight, and repeatable engineering workflows.
Ensure AI usage follows company security, data protection, privacy, and compliance policies.
Application Modernisation and Rearchitecture
Contribute to the rearchitecture of monolithic transactional applications into modern, resilient AWS-native patterns.
Support containerisation strategies using Docker, ECS, and Fargate.
Design and implement event-driven patterns using SQS, SNS, and EventBridge.
Support blue/green and canary deployment strategies for safer releases and reduced downtime.
Improve application reliability, scalability, observability, and operational maintainability.
Documentation, Security, and Compliance
Produce clear technical documentation for completed work, including architecture decisions, runbooks, migration plans, configuration records, and operational procedures.
Maintain standard operating procedures and incident response runbooks.
Support ISO 27001 audit activity by providing evidence of infrastructure controls, change history, access controls, monitoring, and operational procedures.
Contribute to change management processes, including risk assessment, implementation planning, rollback planning, and CAB submissions.
Apply least-privilege, secure-by-design, and auditability principles across all infrastructure and operational work.
Requirements
5+ years in DevOps, SRE, Cloud Engineering, or Infrastructure Engineering, including at least 3 years of hands-on AWS experience.
Deep hands-on experience with EC2, RDS, Aurora, ECS/Fargate, S3, VPC, IAM, Lambda, CloudWatch, Secrets Manager, and QuickSight.
Strong Terraform, CloudFormation, or AWS CDK experience.
Experience building and maintaining deployment pipelines using GitLab CI, GitHub Actions, AWS CodePipeline, or equivalent.
Strong PostgreSQL administration experience, including replication, performance tuning, backup and recovery, upgrades, and pg_dump / pg_restore.
Working knowledge of SQL Server on RDS, including query optimisation, index management, backups, and maintenance tasks.
Advanced Linux administration experience, preferably with Amazon Linux and RHEL-based distributions.
Strong Bash scripting and automation skills.
Practical experience with VPCs, routing, subnets, security groups, NACLs, VPNs, and Transit Gateway.
Strong understanding of IAM, encryption, secrets management, patching, access control, and least-privilege design.
Practical interest or experience in AI-assisted engineering, Amazon Bedrock, Kiro, LLMs, prompt engineering, or AI-supported development workflows.
Ability to produce clear, complete, audit-ready technical documentation.
Excellent written and spoken English, with the ability to explain complex technical topics to varied audiences.
Desirable skills
Experience in regulated financial services, payments, banking, or similar environments.
ISO 27001, SOC 2, PCI DSS, or other audit / compliance exposure.
AWS Solutions Architect Professional, AWS DevOps Engineer Professional, AWS Database Specialty, or equivalent.
Cross-engine migration experience, especially SQL Server to PostgreSQL using AWS DMS or native tooling.
Direct Connect, Transit Gateway, Site-to-Site VPN, IPAM, and multi-account networking.
AWS cost optimisation, CUR analysis, QuickSight billing dashboards, Savings Plans, Reserved Instances, and tagging strategies.
Docker, ECS task definitions, Fargate, service discovery, and service mesh concepts.
Amazon Bedrock, Kiro, SageMaker, Claude models, AI-assisted development, or AI-enabled operational automation.
Experience with log aggregation, distributed tracing, SLOs, dashboards, and incident response workflows.
Tech Stack
Apache
AWS
Cloud
Docker
DynamoDB
EC2
Linux
Postgres
SQL
Terraform
Benefits
Competitive base salary, commensurate with experience.
Annual performance bonus.
Pension / retirement contribution.
Health insurance.
25 days annual leave plus public holidays.
Company-supported training and upskilling, including AWS certification preparation, AI/ML workshops, technology days, and conference attendance.
Learning and development budget for certifications, conferences, and training.