Monthly Archives - January 2017

Think before you execute

How system admins can use planning and checklists to improve quality

Introduction

It sounds like an obvious statement to make, but in this article I will look at how it applies in practice to a system admin.

It starts at the beginning

When a task (in the form of an incident request or a service request) comes to a system admin, the principle applies from the very start: how deeply do you think about it? My observation is that the natural thing to do is to take a superficial view of the task and then either run a Google search or have a quick think to find commands that will resolve it. The fundamental issue with this approach is that it is solution-centric, not problem-centric, i.e. not enough time is spent understanding the problem before coming up with a solution. Another problem with the approach is the rush to execute, driven by the belief that being in front of the system and executing is what will make things work.

Thinking before executing addresses both of these problems:

  • Has the problem been well understood?
    If the problem is not well understood, the advice found on Google will not be understood either, and you will solve the problem only by coincidence. Quite likely the problem will resurface. If, on the other hand, the problem is well understood, the solution will quite often be obvious, or the advice on the internet will make sense.
  • Once the problem is understood, a detailed plan is needed.
    Most tickets require multiple steps to solve. If each step is not well documented, you are forced to plan while executing, which in our experience is very difficult.
  • Our approach is to come up with a detailed plan before executing. It is a modified and more comprehensive checklist. We have SOPs (Standard Operating Procedures) for common tasks, but this works even if we do not have an SOP. We use a spreadsheet.
    • At the top we have the information we will need before we start executing. This includes information such as the IPs (internal or external) of the systems we will use, any passwords we will need, etc. Here is a sample of a sheet we use for rehosting: (image: info_sheet)
    • Below that is a table that is both a planning and an execution tool. The table has the following headers:
      • Function – overall heading for a set of commands
      • Step details / reference – SOP reference and commands to be executed
      • Command – the actual command that will be executed. Note the commands are created before they are executed.
      • Status – marked upon execution
    (image: audit_trail)
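The sheet described above can also be sketched in code. The following is a minimal Python illustration, not the tooling we actually use: the field names mirror the table headers, while the hosts, SOP references, commands, and the helper names `missing_info`, `mark_done`, and `to_csv` are all hypothetical. It assumes the plan is written out in full before anything is run, and that steps are marked in order.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

# Hypothetical values throughout: hosts, SOP references, and commands are
# placeholders for illustration, not taken from a real sheet.


@dataclass
class Step:
    function: str     # overall heading for a set of commands
    reference: str    # step details / SOP reference
    command: str      # exact command, written before execution
    status: str = ""  # filled in only once the step has been run


def missing_info(info: dict) -> list:
    """Return the names of info fields that are still blank."""
    return [key for key, value in info.items() if not str(value).strip()]


def mark_done(plan: list, index: int, status: str = "done") -> None:
    """Record the outcome of one step; earlier steps must already be marked."""
    if any(not step.status for step in plan[:index]):
        raise ValueError("an earlier step has not been marked yet")
    plan[index].status = status


def to_csv(plan: list) -> str:
    """Serialize the plan so it can be opened as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Step)])
    writer.writeheader()
    for step in plan:
        writer.writerow(asdict(step))
    return buffer.getvalue()


# Top of the sheet: information needed before execution starts.
info = {"source_ip": "192.0.2.10", "target_ip": "192.0.2.20", "db_password": ""}

# Table below it: every command is written down before any is executed.
plan = [
    Step("Backup", "SOP-backup 1", "tar -czf /tmp/site.tar.gz /var/www/site"),
    Step("Transfer", "SOP-rehost 2", "scp /tmp/site.tar.gz admin@192.0.2.20:/tmp/"),
]

print(missing_info(info))  # fields to fill in before starting
mark_done(plan, 0)         # mark the first step after running it by hand
print(to_csv(plan))        # the sheet, including the status column
```

Refusing to mark a step while an earlier one is still blank is one way to keep the status column honest, so the saved sheet reflects the order in which things were actually done.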

Conclusion

Checklists help you be purposeful in execution, and the result is an audit trail that can be used to debug issues later on.