SRE Anti-Pattern: "I'm An Expert, I Don't Check the Wiki."
Damon Edwards     September 17, 2018 Rundeck, Self-Service Operations, SRE

In this edition of SRE Anti-Patterns, I'm highlighting one of the more substantial shortcomings of written documentation — it is difficult to get people to read it!

SRE Anti-Pattern: "I Could Fix It, If I Could Get To It"

In this edition of SRE Anti-Patterns, I'm highlighting the common enterprise problem of disjointed access. Often the people responding to an incident are blocked from taking the required recovery actions even though they have the first-hand knowledge and experience needed to know what to do.

Operations: The Last Mile Problem For DevOps

The last mile problem for DevOps is Operations. "The last mile" is an economic concept, born in the telco industry, that describes the last bit of effort that is required to extract the benefit of significant previous investments. The metaphor is a good fit for the relationship between DevOps and Operations.

Tickets Make Operations Unnecessarily Miserable

IT Operations has always been difficult. There is always too much work to do, not enough time to do it, and frequent interrupts. Moreover, there is the relentless pressure from executives who hold the view that everything takes too long, breaks too often, and costs too much.

In search of improvement, we have repeatedly bet on new tools to improve our work. We’ve cycled through new platforms (e.g., Virtualization, Cloud, Docker, Kubernetes) and new automation (e.g., Puppet, Chef, Ansible). While each comes with its own merits, has the stress and overload on operations fundamentally changed?

Enterprises have also spent the past two decades liberally applying management frameworks like ITIL and COBIT. Would an average operations engineer say things have gotten better or worse?

In the midst of all of this, there is conventional wisdom that rarely gets questioned.

Rundeck for SRE: Create Standard Operating Procedures and Enable Operations as a Service (Video)

This video tells the story of a typical SRE journey with Rundeck. First, teams use Rundeck to create standard operating procedures and checklists. Second, they use Rundeck to safely enable Operations as a Service, so that people who are traditionally outside of the operations organization can execute operations procedures.

SREs Expanding Their Use of Rundeck: "Helping Me" + "Helping You"

It has been interesting to watch how Rundeck spreads in organizations that are also adopting SRE practices. The emerging role of SRE (Site Reliability Engineering) focuses on using software engineering skills to build and operate highly reliable and highly scalable services.

Their Rundeck usage can be divided into two categories that I call "helping me" and "helping you".