In operations, we often find ourselves dominated by the urgent. The site is down *right now*! All hands on deck! Much has been said about the dangers of pager fatigue, toil, and urgent tactical work. We in Site Reliability Engineering pride ourselves on being aware of this and on being proactive rather than reactive. It turns out, though, that this proactivity has limits. In this talk, we would like to tell a story about the far end of the spectrum: work that is critically important but has a long time horizon. Most organizations are not set up to handle this well, SRE included.
Michael has been an SRE at Google since 2007, working primarily on infrastructure systems.