Job Vacancy Site Reliability Engineer

London

CGI

Job position Site Reliability Engineer

Permanent

As soon as possible

London, England, United Kingdom

Published on 11/05/2026

We are seeking an experienced and proactive Site Reliability Engineer (SRE) to join a team supporting multiple data product and platform groups. This role is focused on improving the reliability, scalability, observability, and operational performance of critical data-driven platforms and services across complex production environments.

The successful candidate will work closely with engineering, platform, and support teams to strengthen monitoring and alerting capabilities, improve logging and traceability, troubleshoot production incidents, support deployments, and automate operational processes wherever possible. The environment includes Kubernetes, Helm, the ELK stack, and a strong focus on modern Site Reliability Engineering practices across cloud and platform services.

This is a hands-on technical role suited to someone who thrives in fast-paced operational environments and is passionate about reliability engineering, automation, and continuous improvement. The role requires strong collaboration with both client stakeholders and engineering teams to ensure platform stability, operational excellence, and high service availability.

Candidate profile

- Support, maintain, and improve highly available production platforms and services across cloud and containerised environments.

- Manage and support Kubernetes clusters and Helm-based deployments across multiple environments.

- Implement and enhance monitoring, alerting, logging, and observability solutions to improve platform reliability and operational visibility.

- Investigate incidents, analyse logs, identify root causes, and drive timely resolution of production issues.

- Participate in incident response, post-incident reviews, and continuous operational improvement initiatives.

- Automate operational tasks and repetitive support activities to reduce manual effort and improve platform efficiency.

- Work closely with engineering and data platform teams to improve system resilience, scalability, deployment reliability, and operational maturity.

- Develop and maintain operational documentation, support procedures, runbooks, and troubleshooting guides.

- Contribute to reliability engineering practices including proactive monitoring, service health management, and operational readiness.

- Support deployment activities, release processes, and production change management activities.

Required qualifications to be successful in this role

- Strong commercial experience in Site Reliability Engineering, Platform Engineering, DevOps, or Production Support environments.

- Strong hands-on experience with Kubernetes and Helm in enterprise or production environments.

- Proven experience supporting mission-critical production platforms and operational support functions.

- Strong hands-on experience with the ELK stack (Elasticsearch, Logstash, Kibana) for logging, monitoring, troubleshooting, and operational analysis.

- Demonstrated capability in log analysis, incident investigation, troubleshooting, and root cause analysis.

- Strong understanding and practical experience with core SRE practices including:

  - Monitoring and alerting

  - Incident management and response

  - Root cause analysis and post-incident reviews

  - Automation and operational improvement

  - Production support and reliability engineering

- Experience working with data platforms, analytics platforms, or data product teams would be highly advantageous.

- Experience with scripting and automation tools such as Bash, Python, or similar technologies is desirable.

- Exposure to CI/CD pipelines, Infrastructure as Code, and cloud-native environments would be beneficial.

- Strong communication, stakeholder engagement, and collaboration skills.

- Ability to work effectively in fast-paced support environments and manage competing priorities under pressure.

Security Clearance

- Candidate must be willing and able to work onsite at the client location five days per week.

- Candidate must already hold current HLC clearance (mandatory requirement).

- Previous experience working within secure, government, defence, or highly regulated environments will be highly regarded.

- Due to client security requirements, only candidates meeting the required clearance criteria will be considered.

Working environment

Together, as owners, let’s turn meaningful insights into action.

Life at CGI is rooted in ownership, teamwork, respect and belonging. Here, you’ll reach your full potential because…

You are invited to be an owner from day 1 as we work together to bring our Dream to life. That’s why we call ourselves CGI Partners rather than employees. We benefit from our collective success and actively shape our company’s strategy and direction.

Your work creates value. You’ll develop innovative solutions and build relationships with teammates and clients while accessing global capabilities to scale your ideas, embrace new opportunities, and benefit from expansive industry and technology expertise.

You’ll shape your career by joining a company built to grow and last. You’ll be supported by leaders who care about your health and well-being and provide you with opportunities to deepen your skills and broaden your horizons.

Come join our team—one of the largest IT and business consulting services firms in the world.

Discover CGI

London, England, United Kingdom

> 1000 employees

IT services

We are a global business with 91,500 professionals across hundreds of locations worldwide, providing end-to-end IT and business process services that drive our clients’ businesses. In the UK alone, we have around 6,000 member-partners working across 19 towns and cities, doing everything from writing complex code to designing sophisticated software solutions. The work we do powers some of the most ambitious projects that touch the lives of many. Our approach sets us apart, characterised by our proximity model, international presence, expertise, and operational excellence. We organise our operations within metro markets where our clients have concentrated footprints, empowering our local teams to build trusted, in-person relationships. As you build your career here, you’ll develop your skills, share your insights, and work side-by-side with colleagues and clients on innovative solutions that address the most complex issues. Our commitment to ownership, flexibility, and being approachable makes us easy to do business with. Join us, and together, as owners, let’s turn meaningful insights into action to meet the needs and balance the interests of our clients, shareholders, communities, and each other.

Benefits:

- Insurance coverage

- Medical benefits

- Pension plan

- Member Assistant Programme

- Check4Cancer

- Flexible time off

- Share Purchase Plan

- Member discounts

- Dental benefits

- Vision benefits

- Profit Participation Plan

- Health and Wellbeing Programme
