Job position: Site Reliability Engineer - CO - G7
As a DevOps Engineer on the GovWifi service, you will be part of a cross-disciplinary team responsible for ensuring the secure, reliable, and efficient operation of a government-critical platform. Your work will directly support thousands of users across the UK public sector, helping create a seamless and secure WiFi experience in government buildings nationwide.
What you’ll be doing:
Maintaining service reliability: Monitor, manage and improve the availability of GovWifi, ensuring the platform consistently meets service level objectives. Respond to and resolve incidents quickly, serving as a point of escalation when needed.
Automating infrastructure: Use Terraform (or other IaC tools) to automate deployments and infrastructure changes, reducing manual intervention and improving consistency.
Deploying securely: Carry out safe, reliable deployments of code and configuration into AWS environments (ECS, EC2, CloudWatch, ELB, CodeBuild, CodePipeline).
Improving system resilience: Design, build and implement monitoring, alerting, and recovery mechanisms to keep systems highly available and secure.
Mitigating risks: Identify, assess, and reduce security vulnerabilities across the platform, applying web security best practices and implementing protective measures.
Supporting migrations and transitions: Assist with tool changes, platform improvements, or policy-driven migrations that affect GovWifi operations.
Building for users: Develop new features or improvements through prototyping, proof-of-concepts, and continuous iteration in collaboration with product managers and developers.
Knowledge sharing: Document technical decisions clearly, add to the team’s knowledge base, and explain complex issues to non-technical colleagues in a clear, supportive way.
Customer support: Engage with end-user requests and issues through support tools such as Zendesk, helping resolve technical challenges directly impacting users.
Driving continuous improvement: Pair with teammates, contribute to engineering improvement initiatives, and promote best practices across the service.
Ways of working:
You’ll spend your time collaborating closely with site reliability engineers, developers, product managers, and central teams. You’ll work independently when needed, but also in pairs and group settings to solve problems. You’ll play an active role in incident reviews, retrospectives, and roadmap planning. The role requires curiosity, adaptability, and a commitment to secure, user-centred service delivery.
Candidate profile
Essential Criteria
Strong technical expertise in AWS cloud services (ECS, EC2, CloudWatch, ELB, CodeBuild, CodePipeline).
Strong expertise in Terraform or CloudFormation, and a willingness to learn new technologies.
Proficient in at least one scripting or programming language (Python, JavaScript, Ruby, Bash).
Solid understanding of network protocols (TCP/UDP), AWS VPC networking, ports, and security groups.
Familiarity with containerisation technologies, particularly Docker.
Experience building, deploying, and maintaining resilient, highly available, monitored systems.
Good knowledge of cybersecurity principles and secure system design.
Comfortable working with CI/CD pipelines using tools like Jenkins, GitHub Actions, Concourse, or CodePipeline.
Experience with Linux operating systems and web application technologies.
Ability and willingness to document work clearly and share knowledge across technical and non-technical audiences.
Strong problem-solving skills with a proactive approach to identifying and resolving complex issues.
Comfortable working independently and collaboratively (pair programming, agile teamwork).
Excellent communication skills including explaining technical issues to non-technical stakeholders.
Willingness to engage in customer-facing support through ticketing tools such as Zendesk.
Experience working in agile environments and ability to prototype and iterate on new solutions.
Desirable (not essential but advantageous)
Experience with RADIUS or network engineering.
Leading or contributing to engineering improvement projects.
Passion for improving public sector IT services and working within a collaborative, inclusive team.
Empathetic, supportive, and adaptable mindset.
Working environment
Bristol, London, Manchester
About the job
Job summary
GovWifi is a government-critical service that enables secure, consistent WiFi access across the UK public sector, supporting staff and visitors in thousands of locations. We’re looking for a skilled DevOps Engineer to help keep this high-profile platform reliable, secure, and future-ready.
You’ll work with a multi-disciplinary team to maintain service availability, automate infrastructure, and deliver improvements. From deploying secure solutions in AWS to strengthening monitoring and incident response, you’ll play a vital role in keeping GovWifi resilient at scale.
If you enjoy solving complex problems, collaborating with diverse teams, and want your engineering skills to directly benefit the public sector, this is your opportunity to make a real impact on a service used nationwide.
Government Digital & Data