Session Name: The E Stands for Enablement - Modern SRE at PagerDuty
PagerDuty led with a digital transformation model of Full-Service Ownership - meaning our teams take responsibility for supporting the software they deliver at every stage of the software/service lifecycle. That level of ownership brings development teams much closer to their customers, the business, and the value being delivered. But as with many start-ups, as the company scaled, the number of technologies and skills that teams needed to know or support grew too. We hit a clear point where teams had too many things to manage, creating toil. So we re-evaluated what our Site Reliability Organization needed in order to enable PagerDuty to scale without losing the full-service ownership model that has made our teams so successful.
Enter our new framework for Modern Site Reliability. We deliver three types of capabilities: Enablement and Education, Platform, and Core Infrastructure. We looked for places where developers faced toil and for high-impact automations that would reduce hours' worth of work to minutes. We revisited which services teams needed to own and which made more sense as core platform and infrastructure. This new framework will drive our future roadmap and goals for SRE at PagerDuty.
In this session, Paula will share how we used these three categories to evaluate and redesign our SRE Platform services and roadmap to improve developer effectiveness, increase reliability, and scale the organization.
Paula Thrasher, Senior Director of Infrastructure and Shared Services at PagerDuty, joined the company in January 2021. Prior to that, she led DevOps transformations at a variety of large enterprise companies and the federal government, including United Technologies (Raytheon) and CSRA (GDIT). Paula previously spoke at All Day DevOps in 2017 and has also spoken at RSA DevSecOps, DevOps Enterprise, and a variety of other gatherings of great DevOps minds.