Sessions: Site Reliability Engineering: Anti-Patterns in Everyday Life and What They Teach Us
Real-world experience and things that go wrong are two of life’s best teachers. This talk will explore key elements of scalable large-system design and Site Reliability Engineering (SRE) principles* through anti-patterns encountered in real life. Find out what lessons can be gleaned from watching the dynamics in a crowded cafe or dealing with a security issue during a hotel stay. Learn about fundamental SRE principles and practices, including:
Avoiding cascading failures
Not feeding the machines with human toil
Writing blameless postmortems
Engineering solutions to eliminate classes of errors rather than implementing point fixes
The talk will frame these principles through the lens of the suboptimal and demonstrate the impact of SRE anti-patterns on user trust.
* SRE is often thought of as a specific implementation of the DevOps interface.
Speaker/Track Organizer Bio:
Jennifer Petoff is Google's Director of SRE Education and is based in Dublin, Ireland. She leads the SRE EDU program globally and is one of the co-editors of the best-selling book, Site Reliability Engineering: How Google Runs Production Systems.