The Philosophy of System Administration

I found this article in the Red Hat Enterprise Linux Introduction to System Administration handbook and am reproducing it here to share the knowledge.

Although the specifics of being a system administrator may change from platform to platform, there are underlying themes that do not. These themes make up the philosophy of system administration.

The themes are:

  • Automate everything
  • Document everything
  • Communicate as much as possible
  • Know your resources
  • Know your users
  • Know your business
  • Security cannot be an afterthought
  • Plan ahead
  • Expect the unexpected

Automate Everything

Most system administrators are outnumbered either by their users, their systems, or both. In many cases, automation is the only way to keep up. In general, anything done more than once should be examined as a possible candidate for automation.

Here are some commonly automated tasks:

  • Free disk space checking and reporting
  • Backups
  • System performance data collection
  • User account maintenance (creation, deletion, etc.)
  • Business-specific functions (pushing new data to a Web server, running monthly/quarterly/yearly reports, etc.)

This list is by no means complete; the functions automated by system administrators are only limited by an administrator’s willingness to write the necessary scripts. In this case, being lazy (and making the computer do more of the mundane work) is actually a good thing.

Automation also gives users the extra benefit of greater predictability and consistency of service.
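As an illustration of the first item in the list above (free disk space checking and reporting), here is a minimal Python sketch. It is not from the handbook: the function names, the 10% threshold, and the report wording are all assumptions chosen for the example, and a real script would probably mail the report or feed it to a monitoring system instead of printing it.

```python
import shutil


def check_free_space(paths, min_free_fraction=0.10):
    """Return (path, free_fraction) pairs for paths below the threshold.

    min_free_fraction is an assumed default (10% free), not a recommendation.
    """
    low = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        if usage.total == 0:
            continue  # skip pseudo-filesystems that report no capacity
        free_fraction = usage.free / usage.total
        if free_fraction < min_free_fraction:
            low.append((path, free_fraction))
    return low


def format_report(low):
    """Turn the list from check_free_space() into a short text report."""
    if not low:
        return "All checked filesystems have sufficient free space."
    lines = ["Low disk space:"]
    for path, fraction in low:
        lines.append(f"  {path}: {fraction:.1%} free")
    return "\n".join(lines)


if __name__ == "__main__":
    # "/" is only an example; list whichever mount points matter to you.
    print(format_report(check_free_space(["/"])))
```

Run from cron or a systemd timer, a script along these lines turns a manual check into a routine report, which is the kind of predictability the section describes.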

Document Everything

If given the choice between installing a brand-new server and writing a procedural document on performing system backups, the average system administrator would install the new server every time.

While this is not at all unusual, you must document what you do. Many system administrators put off doing the necessary documentation for a variety of reasons:

“I will get around to it later.”
Unfortunately, this is usually not true. Even if a system administrator is not kidding themselves, the nature of the job is such that everyday tasks are usually too chaotic to “do it later.” Even worse, the longer it is put off, the more that is forgotten, leading to a much less detailed (and therefore, less useful) document.

“Why write it up? I will remember it.”
Unless you are one of those rare individuals with a photographic memory, no, you will not remember it. Or worse, you will remember only half of it, not realizing that you are missing the whole story. This leads to wasted time either trying to relearn what you had forgotten or fixing what you had broken due to your incomplete understanding of the situation.

“If I keep it in my head, they will not fire me. I will have job security!”
While this may work for a while, invariably it leads to less, not more, job security. Think for a moment about what may happen during an emergency. You may not be available; your documentation may save the day by letting someone else resolve the problem in your absence. And never forget that emergencies tend to be times when upper management pays close attention.

In such cases, it is better to have your documentation be part of the solution than it is for your absence to be part of the problem. In addition, if you are part of a small but growing organization, eventually there will be a need for another system administrator. How can this person learn to back you up if everything is in your head? Worse yet, not documenting may make you so indispensable that you might not be able to advance your career. You could end up working for the very person who was hired to assist you.

Hopefully you are now sold on the benefit of system documentation. That brings us to the next question: What should you document? Here is a partial list:

  • Policies
  • Procedures
  • Changes

All of these should be documented in some fashion. Otherwise, you could find yourself completely confused about a change you made several months earlier.
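One lightweight way to document changes is to append a dated line to a plain-text change log each time you alter a system. The Python sketch below shows the idea. It is only an illustration: the entry format, the function names, and the log file location are assumptions made for this example, not something the handbook prescribes.

```python
import getpass
from datetime import datetime, timezone


def format_change_entry(summary, reason, author=None, when=None):
    """Build a one-line change log entry (the layout is an assumed convention)."""
    author = author or getpass.getuser()
    when = when or datetime.now(timezone.utc)
    return f"{when:%Y-%m-%d %H:%M} UTC | {author} | {summary} | reason: {reason}"


def record_change(log_path, summary, reason, author=None, when=None):
    """Append an entry to the change log at log_path and return the entry."""
    entry = format_change_entry(summary, reason, author=author, when=when)
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry + "\n")
    return entry


if __name__ == "__main__":
    # "changes.log" is a hypothetical location; use whatever your site agrees on.
    print(record_change("changes.log", "Raised open-file limit", "build failures"))
```

Recording the reason alongside the change is what helps most when you revisit it months later.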

Communicate as Much as Possible

When it comes to your users, you can never communicate too much. Be aware that small system changes you might think are practically unnoticeable could very well completely confuse the administrative assistant in Human Resources.

The method by which you communicate with your users can vary according to your organization. Some organizations use email; others, an internal website. A sheet of paper tacked to a bulletin board in the breakroom may even suffice at some places. In any case, use whatever method(s) work well at your organization.

In general, it is best to follow this paraphrased approach used in writing newspaper stories:

  • Tell your users what you are going to do
  • Tell your users what you are doing
  • Tell your users what you have done

Know Your Resources

System administration is mostly a matter of balancing available resources against the people and programs that use those resources. Therefore, your career as a system administrator will be a short and stress-filled one unless you fully understand the resources you have at your disposal. Some of the resources are ones that seem pretty obvious:

  • System resources, such as available processing power, memory, and disk space
  • Network bandwidth
  • Available money in the IT budget

But some may not be so obvious:

  • The services of operations personnel, other system administrators, or even an administrative assistant
  • Time (often of critical importance when the time involves things such as the amount of time during which system backups may take place)
  • Knowledge (whether it is stored in books, system documentation, or the brain of a person who has worked at the company for the past twenty years)

It is highly valuable to take a complete inventory of the resources available to you and to keep it current; a lack of “situational awareness” when it comes to available resources can often be worse than no awareness at all.

Know Your Users

Users are those people who use the systems and resources for which you are responsible, no more and no less. As such, they are central to your ability to successfully administer your systems; without understanding your users, how can you understand the system resources they require?

For example, consider a bank teller. A bank teller uses a strictly-defined set of applications and requires little in the way of system resources. A software engineer, on the other hand, may use many different applications and always welcomes more system resources (for faster build times). Two entirely different users with two entirely different needs.

Make sure you learn as much about your users as you can.

Know Your Business

Whether you work for a large, multinational corporation or a small community college, you must still understand the nature of the business environment in which you work. This can be boiled down to one question:

What is the purpose of the systems you administer?

The key point here is to understand your systems’ purpose in a more global sense:

  • Applications that must be run within certain time frames, such as at the end of a month, quarter, or year
  • The times during which system maintenance may be done
  • New technologies that could be used to resolve long-standing business problems

By taking into account your organization’s business, you will find that your day-to-day decisions will be better for your users, and for you.

Security Cannot Be an Afterthought

No matter what you might think about the environment in which your systems are running, you cannot take security for granted. Even standalone systems not connected to the Internet may be at risk (although obviously the risks will be different from those of a system that has connections to the outside world).

Therefore, it is extremely important to consider the security implications of everything you do. The following list illustrates the different kinds of issues you should consider:

  • The nature of possible threats to each of the systems under your care
  • The location, type, and value of the data on those systems
  • The type and frequency of authorized access to the systems

While you are thinking about security, do not make the mistake of assuming that possible intruders will only attack your systems from outside of your company. Many times the perpetrator is someone within the company. So the next time you walk around the office, look at the people around you and ask yourself this question:

What would happen if that person were to attempt to subvert our security?

Plan Ahead

System administrators who took all this advice to heart and did their best to follow it would be fantastic system administrators, but only for a day. Eventually, the environment will change, and one day our fantastic administrator would be caught unprepared. The reason? Our fantastic administrator failed to plan ahead.

Certainly no one can predict the future with 100% accuracy. However, with a bit of awareness it is easy to read the signs of many changes:

  • An offhand mention of a new project gearing up during that boring weekly staff meeting is a sure sign that you will likely need to support new users in the near future
  • Talk of an impending acquisition means that you may end up being responsible for new (and possibly incompatible) systems in one or more remote locations

Being able to read these signs (and to respond effectively to them) makes life easier for you and your users.

Expect the Unexpected

While the phrase “expect the unexpected” is trite, it reflects an underlying truth that all system administrators must understand:

There will be times when you are caught off-guard.

After becoming comfortable with this uncomfortable fact of life, what can a concerned system administrator do? The answer lies in flexibility: performing your job in a way that gives you (and your users) the most options possible. Take, for example, the issue of disk space. Given that never having sufficient disk space seems to be as much a physical law as the law of gravity, it is reasonable to assume that at some point you will be confronted with a desperate need for additional disk space right now.

What would a system administrator who expects the unexpected do in this case? Perhaps it is possible to keep a few disk drives sitting on the shelf as spares in case of hardware problems. A spare of this type could be quickly deployed on a temporary basis to address the short-term need for disk space, giving time to more permanently resolve the issue (by following the standard procedure for procuring additional disk drives, for example).

By trying to anticipate problems before they occur, you will be in a position to respond more quickly and effectively than if you let yourself be surprised.
