Lead and manage a global Hosting Operations team, including infrastructure, cloud, systems, and operations engineers.
Establish team goals, performance objectives, and career development plans.
Conduct performance reviews, coaching, mentoring, and succession planning.
Recruit, develop, and retain high-performing technical staff.
Foster a culture of operational excellence, accountability, continuous improvement, and customer focus.
Own the availability, reliability, performance, and operational health of hosted environments.
Manage infrastructure supporting customer and corporate systems across cloud and on-premises platforms.
Ensure operational readiness through monitoring, alerting, incident response, and problem management processes.
Lead capacity planning, infrastructure forecasting, and resource optimization activities.
Oversee infrastructure lifecycle management, including hardware refreshes, operating system upgrades, and platform modernization initiatives.
Ensure backup, recovery, business continuity, and disaster recovery capabilities are maintained and regularly tested.
Drive automation initiatives that improve operational efficiency and reduce manual effort.
Establish and maintain operational standards, procedures, runbooks, and documentation.
Serve as an escalation point for critical customer issues and service disruptions.
Participate in customer discussions regarding infrastructure, hosting operations, service performance, and operational improvements.
Ensure service level agreements (SLAs) and operational commitments are consistently met.
Lead major incident management activities and post-incident reviews.
Communicate operational status, risks, and improvement initiatives to customers and executive leadership.
Oversee operations across AWS and other cloud platforms as required.
Ensure efficient utilization of cloud and data center resources.
Monitor cloud spending and identify cost optimization opportunities.
Collaborate with Engineering teams on platform architecture, scalability, and operational requirements.
Ensure hosting environments comply with company security policies, customer requirements, and regulatory obligations.
Partner with Information Security teams to address vulnerabilities, remediation activities, and security monitoring.
Support audits, compliance assessments, and customer security reviews.
Establish and maintain Hosting Operations policies, procedures, standards, and controls within the Business Management System.
Ensure adherence to Quality Management System (QMS) and Information Security Management System (ISMS) requirements.
Define and maintain Hosting Operations KPIs and service health metrics.
Implement metrics-driven processes to measure service quality, availability, performance, and operational efficiency.
Lead root cause analysis investigations and corrective action programs.
Identify opportunities for operational improvements, automation, and cost reduction.
Provide regular reporting on operational performance, risks, and improvement initiatives.
Requirements
3+ years of people leadership experience, including managing global or geographically distributed technical teams.
Deep experience operating mission-critical production systems with demanding availability, reliability, and performance requirements, preferably in AWS cloud environments.
Demonstrated success leading root cause analysis (RCA) and corrective action programs that improve service reliability and operational effectiveness.
Bachelor's degree in Computer Science, Information Technology, Engineering, or equivalent practical experience.
5+ years of experience managing infrastructure, cloud, hosting, or operations teams.
8+ years of experience supporting enterprise infrastructure environments.
At least 3 years of direct people leadership experience.
Experience managing global or geographically distributed technical teams.
Experience supporting both cloud and on-premises infrastructure environments.
Experience operating mission-critical production systems with high availability requirements.
Experience leading root cause analysis and corrective action initiatives.
Experience with IT service management practices, including incident, problem, change, and capacity management.
Experience working directly with enterprise customers.
AWS certifications or equivalent cloud certifications (preferred).
ITIL certification (preferred).
Experience supporting regulated environments and customer audits (preferred).
SAFe Agile experience (preferred).
Aerospace industry experience (preferred).
Tech Stack
AWS
Cloud
ITSM
Hosting Operations Manager at Flatirons Solutions | JobVerse