Self-Service: Making Operations More Secure and Less Risky

"We would love to give our Devs, QA, and Analysts self-service access to privileged Ops tasks. The capability sure would help these teams move quicker and lighten the load on Operations. But alas, Security and Compliance would never allow it, so we don't push for it."

There appears to be a pre-conceived notion that self-service — especially self-service between teams with different levels of privilege — is inherently risky and therefore not an option. However, if you dig in, you'll see that this belief is mostly folklore. Self-service isn't inherently dangerous or out of control. In fact, the case can be made that the opposite is true.

Maybe I have just been lucky in my travels, but I haven't found security or compliance professionals to be unreasonable or categorically against the idea of self-service. The objections I've heard come from how the proposed self-service will be implemented.

Self-service done right should reduce risks and improve compliance controls. If we have the right conversations upfront and consider the needs of security and compliance in the design, we can make security and compliance our allies.

First, let's consider the traditional ticket-driven status quo.


Where are the weaknesses? Most can be traced back to the harmful effects of silos and request queues.

  • The requester (the person who needs the ops task executed) and the doer (the person who does the ops task) are disconnected and working out of context from each other. This leads to slower feedback loops and more incidents of miscommunication. In addition to increasing the likelihood of errors and outages, this also increases security and compliance risks.
  • The doer is continuously context-switching from one request to another, which increases the likelihood of errors as well as security and compliance risks.
  • The doer often uses multiple tools or manually enters commands or kicks off scripts to execute the task. This semi-manual or manual work can introduce unexpected variation (increasing risk) and is difficult to verify or audit (also increasing risk). Often what the doer writes in a comment of the ticket is the only easy-to-obtain record of what transpired.
  • The doer is often working at full capacity (more side effects of silos and queues) and buried under high amounts of toil. As a result, the doer has little to no time to focus on improving automation that will either do the work for them or collect compliance evidence. This condition keeps the doer's work a series of semi-manual or manual one-offs (making the doer's work more risky and less under control).

So of course, when self-service is presented as simply letting the requester (often from a lower-privileged team) have direct access to doing the privileged ops tasks, the reaction is usually quite negative.




In contrast, if you define your self-service correctly, not only do the above-mentioned problems mostly go away, you end up improving your security posture and compliance controls.

Implementing an Operations as a Service design pattern gives you a mechanism by which you can formalize security and compliance controls while removing ticket-driven request queues.




The requesters of ops tasks (Dev, QA, Release, NOC, etc.) get access to specific pre-defined procedures (respecting both least-privilege access and separation of duties requirements). The requesters can (and should be encouraged to) define the procedures as part of their definition of done for their work (e.g., Devs write procedures to manage the services they build). Ops, Security, and Compliance review and approve new procedures and decide who will be permitted to execute which procedures. At all times, Ops, Security, and Compliance stay in control of policies and the built-in evidence collection needed to help them perform effective oversight.
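To make the pattern concrete, here is a minimal sketch in Python of the three controls described above: pre-defined procedures, an approval and role check before execution, and built-in evidence collection. Every name here (`Procedure`, `OpsService`, the `restart-web` procedure, the `dev` and `qa` roles) is hypothetical and chosen only for illustration; a real implementation would sit on top of your existing identity, policy, and logging systems.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Set


@dataclass
class Procedure:
    """A pre-defined ops task (hypothetical model, for illustration only)."""
    name: str
    action: Callable[[], str]   # the reviewed, repeatable task itself
    allowed_roles: Set[str]     # decided by Ops, Security, and Compliance
    approved: bool = False      # set to True only after review


@dataclass
class OpsService:
    """Holds procedures, enforces policy, and records evidence."""
    procedures: Dict[str, Procedure] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def register(self, proc: Procedure) -> None:
        # Requesters (e.g., Devs) can propose procedures; they start unapproved.
        self.procedures[proc.name] = proc

    def approve(self, name: str) -> None:
        # Stands in for the Ops/Security/Compliance review step.
        self.procedures[name].approved = True

    def execute(self, user: str, role: str, name: str) -> str:
        proc = self.procedures.get(name)
        allowed = (
            proc is not None
            and proc.approved
            and role in proc.allowed_roles
        )
        # Evidence is collected for every attempt, allowed or denied.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "procedure": name,
            "outcome": "allowed" if allowed else "denied",
        })
        if not allowed:
            raise PermissionError(f"{user} ({role}) may not run {name!r}")
        return proc.action()


# Example wiring: a Dev-authored procedure, reviewed and then opened to Devs.
service = OpsService()
service.register(
    Procedure("restart-web", lambda: "web restarted", allowed_roles={"dev"})
)
service.approve("restart-web")
result = service.execute("alice", "dev", "restart-web")
```

The point of the sketch is that the requester never receives raw privileged access. They can only invoke a named procedure that has been reviewed, and every attempt leaves a record, which is more than the ticket comment the status quo usually provides.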

Done correctly, everybody wins with Operations as a Service. The organization can safely move faster because of pervasive, but well-controlled, self-service. You really can go faster and lock things down.


