Simplify, Simplify, Simplify

An absolutely crucial rule to keep in mind when troubleshooting PC system problems is that if there are too many unknowns, it is impossible to determine which one is causing the problem. If you have many possible causes for some difficulty with your system, it can be extremely difficult to narrow down the cause of the problem to any one of them. If you are using a just-installed hard disk with an unfamiliar shareware file utility running on an operating system you just upgraded last week, and now your system has problems, how on earth are you ever going to know what is causing them?

To have a fighting chance at figuring out what is going on, you must simplify the situation as much as possible, so that it becomes much more obvious what is responsible for the difficulty. This means reducing the number of variables to whatever degree possible. One important way of doing this is undoing or double-checking any recent changes made to the system.

In addition, I have identified the following items as often being responsible for erratic behavior that can complicate troubleshooting. I would recommend that they be eliminated or temporarily disabled when trying to diagnose a system problem:

  • Power Management: Power management is a great idea in theory but in many ways is just "not ready for prime time". Power management routines can cause symptoms that appear to be hardware malfunctions, such as screens that turn off unexpectedly or hard disks that spin down. They can also crash software that doesn't know how to deal with them. Even if you normally want to use power management, it is wise to turn it off until the problem is resolved.
  • Overclocked Hardware and Aggressive BIOS Settings: I do not believe in overclocking. If you insist on doing it, don't be surprised if you have system problems! Scale things back until you can figure out what the problem is. Similarly, if you are "pushing the envelope" in trying to squeeze maximum performance by tuning your BIOS memory timings and other settings very aggressively, try resetting them to more conservative values when troubleshooting.
  • Experimental or Beta Software: This software is still in the test process and is likely to have bugs--that is why it is labeled as "beta"! For an end application this is usually no big deal, since any crashes or other problems will be limited to that application and therefore somewhat obvious. Running beta operating systems, drivers or other low-level software, however, is asking for trouble, and you should try to eliminate these possible sources of confusion when trying to debug your system.
  • "Creative" Configurations: The more "unusual" things that you have going on in your system, the more likely that you are going to have a conflict caused by one of these strange pieces of hardware or software. A system that is loaded with unusual utilities, terminate-and-stay-resident programs, an old 8-bit network card salvaged from a 286, etc. will often have more problems than a stock Pentium box with a normal Windows 95 installation. To whatever extent possible, disable these items while troubleshooting. Also try to avoid using unusual low-level software whenever possible.
  • Excessive Connections: If the PC is on a network or is connected to a large number or variety of peripheral devices, try disconnecting them to see whether that has any impact on the problem.

In general you want to avoid the unusual or the unknown when troubleshooting. One way to simplify the software environment during diagnosis is to use a boot floppy to "boot clean" and bypass the special drivers and software that you normally load when you boot from your hard disk. You can also use the {F8} key when DOS or Windows 95 is booting to bypass your startup files, basically accomplishing the same thing. You want to be especially wary of software that sits in the background and activates without you specifically telling it to, as this can confound your troubleshooting efforts.

To whatever extent possible, disable as much as you can when trying to figure out a problem. The more funky software utilities, screen savers and cute peripherals you disable now, the better your chances of pinpointing which one is causing the problem when you bring them back one at a time.
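For readers who think in code, the strip-down-then-restore idea can be written as a short Python sketch. This is only an illustration: the function name isolate_culprits and the problem_occurs callback are hypothetical, and in real life the "callback" is you rebooting the PC and watching for the symptom. It also assumes each suspect can be tested alone on a clean baseline, so it will not catch problems that only appear when two items are combined.

```python
from typing import Callable, FrozenSet, Iterable, List


def isolate_culprits(
    suspects: Iterable[str],
    problem_occurs: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Re-enable suspects one at a time on a clean baseline.

    problem_occurs(enabled) is a hypothetical check that reports whether
    the symptom shows up with exactly the given items enabled.
    """
    baseline: FrozenSet[str] = frozenset()  # everything optional is disabled

    # If the symptom survives a clean boot, none of the suspects is to blame.
    if problem_occurs(baseline):
        raise RuntimeError("Problem persists with everything disabled")

    culprits = []
    for item in suspects:
        # Change exactly one variable, test, then go back to the baseline.
        if problem_occurs(baseline | {item}):
            culprits.append(item)
    return culprits


if __name__ == "__main__":
    suspects = ["power management", "beta video driver", "screen saver"]

    # Simulated check for demonstration only: pretend the beta driver is at fault.
    def simulated_check(enabled: FrozenSet[str]) -> bool:
        return "beta video driver" in enabled

    print(isolate_culprits(suspects, simulated_check))
```

The point of the sketch is the loop: only one thing changes between tests, so any change in behavior can be attributed to that one thing.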

