Network Support and Maintenance Plan

New companies and small teams often do not think about a support and maintenance plan until something forces them to. It is good practice to document your computing environment, have procedures in place, and think about other areas such as disaster recovery. It is also important to have a backup person to go to if your primary IT person is not around. Here is an outline of a plan that covers the main components.

Start with a description of your current infrastructure:

[] Inventory all equipment.

[] Make a list of vendor contact details and account numbers.

[] Create a detailed network map.

[] Make a list of tools and applications you are currently using and what they do.

[] Document any special configuration.

Hint: Consider taking a picture of the setup to help others identify it visually. Of course, this only really helps if your lab is not row after row of identical rack-mounted machines. :)

Preventive Maintenance:

List activities you can do here. For example, think about the life expectancy of the equipment, the budget for new equipment, additional capacity for the system (planning for expansion), and how to find deficiencies in your system.
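As one way to make the life-expectancy check concrete, here is a minimal sketch that flags equipment nearing the end of its expected life. The inventory entries, the expected lifetimes, and the one-year warning window are all hypothetical values chosen for illustration; substitute your own inventory and vendor guidance.

```python
from datetime import date

# Hypothetical inventory: (name, purchase date, expected life in years).
INVENTORY = [
    ("file-server", date(2019, 3, 1), 5),
    ("core-switch", date(2022, 6, 15), 7),
    ("ups-1", date(2021, 1, 10), 4),
]


def years_remaining(purchased, life_years, today):
    """Expected years of life left; negative means past expected life."""
    age_years = (today - purchased).days / 365.25
    return life_years - age_years


def due_for_replacement(inventory, today, warn_years=1.0):
    """Return names of items with warn_years or less of expected life left."""
    return [
        name
        for name, purchased, life_years in inventory
        if years_remaining(purchased, life_years, today) <= warn_years
    ]


if __name__ == "__main__":
    for name in due_for_replacement(INVENTORY, date.today()):
        print(f"Budget for replacement: {name}")
```

Running a report like this before each budget cycle gives you a simple list of items to plan for.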

Maintenance Procedures:

[] Start a logbook. A record must be kept of all activities.

[] Set a time (downtime) when you can take down the network. Example: The last Friday of the month 01:00 to 03:00.
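The example window above (the last Friday of the month, 01:00 to 03:00) can be computed rather than looked up by hand. This is a small sketch using only the Python standard library; the function names are made up for illustration.

```python
import calendar
from datetime import date, datetime, time, timedelta

FRIDAY = 4  # date.weekday() numbering: Monday is 0, Friday is 4


def last_friday(year, month):
    """Return the date of the last Friday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from the last day of the month to the nearest Friday.
    days_back = (last_day.weekday() - FRIDAY) % 7
    return last_day - timedelta(days=days_back)


def maintenance_window(year, month):
    """Return (start, end) datetimes for the 01:00 to 03:00 window."""
    day = last_friday(year, month)
    return datetime.combine(day, time(1, 0)), datetime.combine(day, time(3, 0))


def in_window(moment):
    """True if the given datetime falls inside that month's window."""
    start, end = maintenance_window(moment.year, moment.month)
    return start <= moment < end
```

A maintenance script could call in_window(datetime.now()) as a guard so that disruptive steps only run during the agreed downtime.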

Server Maintenance

[] Check Hardware - clean when necessary (machines, cables, switches, battery backup).

[] Check logs for errors.

[] Run Microsoft Baseline Security Analyzer (MBSA).

[] Make sure anti-virus protection is up to date.

[] Disk Cleanup & Defrag.

Ask the question: How many of these activities can be automated using Windows Task Scheduler?
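Checking logs for errors is a good candidate for automation. The sketch below assumes the logs have been exported to plain text, one entry per line, with a severity keyword such as ERROR or CRITICAL somewhere in the line. That format is an assumption; adjust the pattern to whatever your logs actually look like.

```python
import re
from collections import Counter

# Assumption: severity keywords appear as whole words in each log line.
SEVERITY_PATTERN = re.compile(r"\b(ERROR|CRITICAL|FATAL)\b", re.IGNORECASE)


def summarize_errors(lines):
    """Return (counts per severity, list of (line number, text)) for matches."""
    counts = Counter()
    matches = []
    for number, line in enumerate(lines, start=1):
        found = SEVERITY_PATTERN.search(line)
        if found:
            counts[found.group(1).upper()] += 1
            matches.append((number, line.rstrip()))
    return counts, matches


if __name__ == "__main__":
    import sys

    # Usage: python check_logs.py exported_log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        counts, matches = summarize_errors(handle)
    for number, text in matches:
        print(f"{number}: {text}")
    print(dict(counts))
```

A script like this can be registered as a scheduled task so that the summary is waiting for you at the start of each maintenance window.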

Updates

Install updates only during downtime. Before you run the updates, be sure to back up the machine. Plan which updates you need to run (Microsoft, firmware, and other tools).

Try: Windows Server Update Services (WSUS).

Health and Performance Monitoring

[] What information would you like to collect?

Examples: current performance, incidents, unusual traffic or congestion, etc.

[] What are you currently using?

[] What are the steps or checks to be done here?

Hint: Run the tests on a healthy network and collect baseline data that you can compare against future results.
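To illustrate the baseline idea, here is a minimal sketch that summarizes measurements taken on a healthy network and then flags later readings that deviate strongly from them. The choice of metric (for example, ping latency in milliseconds) and the three-standard-deviation threshold are assumptions, not recommendations from any particular monitoring tool.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize measurements taken while the network is known to be healthy."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_unusual(value, baseline, threshold=3.0):
    """True if value is more than threshold standard deviations from the mean."""
    if baseline["stdev"] == 0:
        return value != baseline["mean"]
    deviation = abs(value - baseline["mean"]) / baseline["stdev"]
    return deviation > threshold


if __name__ == "__main__":
    # Hypothetical latency readings in milliseconds.
    healthy = [20, 22, 21, 19, 23, 20, 21]
    baseline = build_baseline(healthy)
    for reading in (22, 60):
        print(reading, "unusual" if is_unusual(reading, baseline) else "normal")
```

Store the baseline alongside the date it was collected, and rebuild it after any significant change to the network.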

Power outage

In the event of a power outage, what happens to your computing environment? Sure, you have a UPS, but if the power fails and you are not around to shut the system down properly, then what? Consider creating a program that detects when the UPS is running on battery and shuts the machine down automatically. Finally, test that the UPS actually works.
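The decision part of such a program can be sketched independently of any particular UPS. In the sketch below, UpsStatus and should_shut_down are hypothetical names, and the thresholds (30 percent charge, five minutes on battery) are placeholders. How the status is actually read depends on the interface your UPS vendor provides, so that part is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of UPS state, filled in from the vendor's interface."""
    on_battery: bool
    charge_percent: float
    seconds_on_battery: float


def should_shut_down(status, min_charge=30.0, max_seconds_on_battery=300.0):
    """Decide whether to start an orderly shutdown.

    Shut down only while on battery, and only once the charge is low
    or the outage has lasted longer than the allowed time.
    """
    if not status.on_battery:
        return False
    return (
        status.charge_percent <= min_charge
        or status.seconds_on_battery >= max_seconds_on_battery
    )
```

A real program would poll the UPS at a regular interval, pass each reading to should_shut_down, and invoke the operating system's shutdown command when it returns True. Keeping the decision in a pure function makes it easy to test without pulling the plug.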

Backups

[] How are you keeping backups?

[] Write instructions for labeling and storage of all backup files.

[] Test the restore process to make sure the backup works.

[] Create a schedule and have it set to automatically run.
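One way to test a restore is to restore into a scratch directory and compare checksums against the original files. The sketch below does that with SHA-256 from the Python standard library. The function names are made up for illustration, and it assumes the originals have not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored copy."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(source) != sha256_of(restored):
            problems.append(str(relative))
    return problems
```

An empty list means every original file was found in the restored copy with identical contents. Anything else is worth investigating before you need the backup for real.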

Disaster Recovery Plan:

Creating scenarios is probably the best way to understand the dependencies and the effects of a failure, and to identify critical components. For each scenario, think about how you want to restore data, whether there is a redundant system to switch to, priorities, etc.
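Scenarios are easier to reason about once the dependencies are written down. As a rough sketch, the code below records which components depend on which, then lists everything affected when one of them fails. The component names and the dependency map are entirely hypothetical.

```python
# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "email": ["mail-server", "dns"],
    "mail-server": ["core-switch", "ups-1"],
    "dns": ["core-switch"],
    "file-share": ["file-server"],
    "file-server": ["core-switch", "ups-1"],
    "core-switch": [],
    "ups-1": [],
}


def affected_by(failed, depends_on):
    """Return every component that directly or indirectly depends on `failed`."""
    affected = set()
    changed = True
    # Repeat until no new component is added, so indirect dependents are found.
    while changed:
        changed = False
        for component, deps in depends_on.items():
            if component == failed or component in affected:
                continue
            if any(dep == failed or dep in affected for dep in deps):
                affected.add(component)
                changed = True
    return sorted(affected)


if __name__ == "__main__":
    for scenario in ("ups-1", "dns", "core-switch"):
        print(scenario, "->", affected_by(scenario, DEPENDS_ON))
```

Components whose failure affects the most services are the critical ones, and they are the first candidates for redundancy.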

Who is responsible for what?

Last but not least ... assign people to the tasks you define.