Service Performance & Resilience Manager
Job Description
We're seeking a Service Performance & Resilience Manager to take ownership of performance, capacity, and resilience across critical IT services. This role focuses on keeping customer-facing services fast, reliable, and fully observable, while driving continuous improvement.
You will lead observability across services, ensuring effective monitoring and actionable insights. You'll manage capacity and performance through forecasting and trend analysis, identifying risks early and driving improvements. You'll also ensure resilience and availability are built into services from the outset, while supporting continuity planning and risk management. Working closely with technical teams and stakeholders, you'll help resolve issues and deliver ongoing service improvements.
Key Requirements
Experience managing capacity and performance in IT environments
Hands-on experience with AWS and Azure
Strong knowledge of ITIL v3/v4 (certification required)
Experience with monitoring/observability tools (e.g. Zabbix, Grafana, Kibana, OpenSearch)
Knowledge of Windows and Linux server environments
Scripting skills (e.g. Python, PowerShell, Node.js)
Experience integrating data via APIs, webhooks, or messaging
Strong analytical, problem-solving, and stakeholder management skills
Desirable:
DevOps exposure
Network infrastructure and communications protocols knowledge
Experience with social alarm platforms
If you're looking for a role where you can make a tangible impact on service performance and resilience, we encourage you to apply.
Spectrum IT Recruitment (South) Limited is acting as an Employment Agency in relation to this vacancy

