Senior Site Reliability Engineer (SRE)

  • Experian
  • Nottingham, East Midlands
  • Full-Time
Published
May 5, 2026
Location
Nottingham, United Kingdom

Senior Site Reliability Engineer (SRE): our view in 3 lines...

  • The Role: An SRE role to improve reliability and performance of business-critical AWS-hosted systems within a distributed engineering environment.
  • The Person: The SRE will manage and operate AWS infrastructure, maintain CI/CD and IaC, respond to incidents and perform RCA, implement automation to reduce toil, and configure monitoring and resiliency for production systems.
  • Requirements: Hands-on experience with AWS services, CI/CD pipelines, Terraform or CloudFormation, SRE principles, monitoring tools such as CloudWatch, Prometheus, Grafana, Splunk, or Dynatrace, Python or Bash scripting, and Linux troubleshooting is required.

Job Description

Company Description

Experian is a global data and technology company, powering opportunities for people and businesses around the world. We help to redefine lending practices, uncover and prevent fraud, simplify healthcare, create marketing solutions, and gain deeper insights into the automotive market, all using our unique combination of data, analytics and software. We also assist millions of people to realize their financial goals and help them save time and money.

We invest in people and new advanced technologies to unlock the power of data. As a FTSE 100 Index company listed on the London Stock Exchange (EXPN), we have a team of 22,500 people across 32 countries. Our corporate headquarters are in Dublin, Ireland. Learn more at experianplc.com.

Internal Grade: EB9/E.

Job Description

We are looking for a Site Reliability Engineer (SRE) to improve the reliability and performance of business-critical systems. You will focus on AWS cloud infrastructure, DevOps tooling, and core SRE practices within a distributed production environment. Reporting to our Lead, you will work with development, platform, and operations teams to ensure systems are stable, scalable, well-monitored, and meet defined reliability targets.

Main Responsibilities

Reliability and Operations:

  • Support high availability, scalability and performance of production systems
  • Work with defined SLIs, SLOs and SLAs, ensuring services meet agreed reliability targets
  • Identify and reduce operational toil through automation and process improvement
  • Contribute to the design and implementation of fault-tolerant and resilient systems
  • Participate in resilience and failure testing activities to validate system behaviour under fault conditions and improve recovery

AWS & Cloud Operations:

  • Manage and operate systems hosted on AWS (EC2, EKS/ECS, RDS, S3, Lambda, CloudWatch, IAM, and VPC)
  • Support cloud deployments and infrastructure changes following best practices
  • Help with backup, disaster recovery and resiliency planning

DevOps & Automation:

  • Work with CI/CD pipelines and DevOps practices to ensure reliable and repeatable deployments, including build, test and release automation processes
  • Use Infrastructure as Code tools such as Terraform or CloudFormation to manage and provision infrastructure
  • Develop automation using scripting languages (Python, Bash or similar) to reduce operational toil and improve efficiency

Incident Management:

  • Participate in production incident response, troubleshooting, and service restoration
  • Perform root cause analysis (RCA) and contribute to post-incident reviews
  • Help implement preventive actions to avoid incident recurrence

Observability:

  • Configure and maintain monitoring, logging, and alerting using tools like CloudWatch, Prometheus, Grafana, Splunk, or Dynatrace
  • Develop dashboards to track system and platform health and reliability metrics across the user journey
  • Improve alert quality to reduce noise and improve response times

Collaboration:

  • Work with application and engineering teams to embed reliability into system design
  • Collaborate within a globally distributed team, using clear handovers to ensure continuity
  • Share knowledge and contribute to team-wide best practices
  • Communicate with all kinds of stakeholders, influencing decisions through reliability-focused insights

Qualifications

  • Experience in production support, DevOps, SRE, cloud operations, or systems engineering
  • Hands-on experience with AWS cloud services, including compute, container and serverless workloads
  • Practical experience with CI/CD pipelines and DevOps practices, including Git-based version control, pull request workflows, code reviews, and deployment automation
  • Experience with SRE principles, monitoring, and reliability engineering practices
  • Proficiency in scripting (Python, Bash, or similar) for automation and operational tooling
  • Experience with Linux systems and troubleshooting production issues

Preferred Experience

  • Exposure to data platforms and data pipelines
  • Understanding of data reliability concepts
  • Experience supporting or operating complex distributed systems

Additional Information

Benefits package includes:

  • Hybrid working
  • Great compensation and discretionary bonus
  • Core benefits include pension, Bupa healthcare, Sharesave scheme and more
  • 25 days annual leave with 8 bank holidays and 3 volunteering days. You can purchase additional annual leave.

We take our people agenda very seriously and focus on what matters: DEI, work/life balance, development, authenticity, collaboration, wellness, reward & recognition, volunteering... the list goes on. Experian's people-first approach is award-winning: World's Best Workplaces™ 2024 (Fortune Top 25), Great Place To Work™ in 24 countries, and Glassdoor Best Places to Work 2024, to name a few. Check out Experian Life on social or our Careers Site to understand why.

Experian is proud to be an Equal Opportunity and Affirmative Action employer. Innovation is an important part of Experian's DNA and practices, and our diverse workforce drives our success. Everyone can succeed at Experian and bring their whole self to work, irrespective of their gender, ethnicity, religion, colour, sexuality, physical ability or age. If you have a disability or special need that requires accommodation, please let us know at the earliest opportunity.

Experian Careers - Creating a better tomorrow together

Find out what it's like to work for Experian by clicking here

This is a hybrid remote/in-office role.

  • Employee Status: Regular
  • Role Type: Hybrid
  • Department: Technology
  • Schedule: Full Time
  • Key Skills
    AWS (EC2, EKS/ECS, Lambda, S3, RDS), CloudFormation, Terraform, CI/CD pipelines, Python, Bash, Prometheus, Grafana, Splunk
