Early processors executed instructions entirely sequentially: the first instruction began execution, completed, and only then did the next one start. This is extremely inefficient, because executing an instruction takes several steps, and only one of them is in progress at any given time. Having to wait for all the steps of instruction 1 to complete before starting instruction 2 would be like running an assembly line where the worker at the start of the line had to wait for the worker at the end to finish the first part before starting on the second. At any given time, every worker on the line but one would be doing nothing.
Of course the whole purpose of an assembly line is to prevent this from happening. The worker at the start of the line performs his or her task on the part, hands it off to the worker next on the assembly line, and then starts work on the next part. A flow of parts goes down the line, with each worker always having something to do. Modern processors do exactly the same thing with instructions; the first step of execution is performed on the first instruction, and then when the instruction passes to the next step, a new instruction is started. This process is called pipelining (after another analogy to the assembly line, a pipeline where you keep product flowing through the pipe). The steps in the pipeline are often called stages.
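The assembly-line flow described above can be sketched as a small table of which instruction occupies which stage on each clock cycle. This is only an illustration: the function name `pipeline_occupancy` is hypothetical, and it assumes every stage takes exactly one cycle and that nothing ever stalls.

```python
def pipeline_occupancy(num_instructions, num_stages):
    """Return one row per clock cycle.

    row[s] holds the number of the instruction sitting in stage s
    during that cycle, or None if the stage is idle.
    Assumption: each stage takes exactly one cycle and there are no stalls.
    """
    total_cycles = num_instructions + num_stages - 1
    rows = []
    for cycle in range(total_cycles):
        row = []
        for stage in range(num_stages):
            # Instruction i (0-based) reaches stage s at cycle i + s.
            instr = cycle - stage
            row.append(instr + 1 if 0 <= instr < num_instructions else None)
        rows.append(row)
    return rows


# Show 5 instructions flowing through a 4-stage pipeline.
for cycle, row in enumerate(pipeline_occupancy(5, 4), start=1):
    cells = " ".join(f"I{i}" if i is not None else "--" for i in row)
    print(f"cycle {cycle}: {cells}")
```

Under these assumptions the five instructions finish in 8 cycles, where strictly sequential execution of the same four steps would take 20. Once the pipeline has filled, every stage is busy on every cycle.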
Pipelining leads to dramatic improvements in performance compared to sequential execution, where much of the processor circuitry lies idle. The more stages you can break the pipeline into, the more theoretical speed you can get from it. For example, suppose it takes 12 clock cycles to handle all the steps needed to process an instruction. In theory, with a 4-stage pipeline each stage takes 3 cycles, so your maximum throughput is 1 instruction every 3 cycles. With a 6-stage pipeline each stage takes 2 cycles, so maximum throughput is 1 instruction every 2 cycles. (This is of course highly simplified.)
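The arithmetic in this example can be checked with a few lines of Python. The helper names below are made up for illustration, and the model is the same highly simplified one as in the text: the 12 cycles of work divide evenly among the stages and the pipeline never stalls.

```python
def cycles_between_completions(cycles_per_instruction, num_stages):
    """Cycles between finished instructions once the pipeline is full.

    Assumption: the work splits evenly, so each stage takes
    cycles_per_instruction / num_stages cycles.
    """
    return cycles_per_instruction / num_stages


def total_cycles(num_instructions, cycles_per_instruction, num_stages):
    """Cycles needed to run num_instructions with no stalls."""
    stage_time = cycles_per_instruction / num_stages
    # The first instruction needs all the stages; each later one
    # finishes one stage time after the instruction before it.
    return stage_time * (num_stages + num_instructions - 1)


print(cycles_between_completions(12, 4))  # 3.0
print(cycles_between_completions(12, 6))  # 2.0

# 100 instructions: no pipelining (1 stage), then 4 and 6 stages.
for stages in (1, 4, 6):
    print(stages, total_cycles(100, 12, stages))
```

In this model, 100 instructions take 1200 cycles without pipelining, 309 cycles with 4 stages, and 210 cycles with 6 stages. Real processors fall short of these figures for the reasons discussed next.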
Pipelining also has some drawbacks, of course. One of these is complexity; there is a lot more work for the processor to do to keep the pipeline moving. Other problems relate to data dependencies. As an example, take a very simple 2-line program in which the second line uses the value calculated by the first.
Can you see how pipelining would cause a problem with this (very common) kind of code fragment? The processor will start executing the second instruction before the first one is finished, but it needs the result from the first instruction in order to execute the second one! A pipelining processor will of course detect and handle this condition, but in the worst case it must do so by waiting for the first instruction to finish before proceeding with the second one. This condition is called a pipeline stall, and it reduces performance. Newer processors have special performance-enhancing features that partially eliminate this sort of problem. In general, the processor wants to keep the pipeline "flowing" as much as possible.
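The worst case described above can be modeled roughly in Python. Everything here is an assumption made for illustration: the function name `schedule`, the variable names in the two sample programs, one cycle per stage, and the rule that a dependent instruction may not enter the pipeline until the instruction producing its input has left the last stage. Real processors usually do better than this.

```python
def schedule(program, num_stages):
    """Return a (start_cycle, finish_cycle) pair for each instruction.

    program is a list of (destination, [sources]) tuples.
    Assumptions: one cycle per stage, instructions enter in order,
    and a result is usable only after its instruction has finished.
    """
    ready_after = {}   # variable name -> cycle in which its value is produced
    timings = []
    previous_start = 0
    for destination, sources in program:
        # Normally an instruction enters one cycle after the previous one.
        start = previous_start + 1
        for source in sources:
            if source in ready_after:
                # Stall: wait for the producing instruction to finish.
                start = max(start, ready_after[source] + 1)
        finish = start + num_stages - 1
        ready_after[destination] = finish
        timings.append((start, finish))
        previous_start = start
    return timings


# Second instruction needs the result of the first (hypothetical names).
dependent = [("a", ["b", "c"]), ("d", ["a", "e"])]
# Second instruction uses unrelated inputs.
independent = [("a", ["b", "c"]), ("d", ["e", "f"])]

print(schedule(dependent, 4))    # [(1, 4), (5, 8)]
print(schedule(independent, 4))  # [(1, 4), (2, 5)]
```

With a 4-stage pipeline, the independent pair finishes in 5 cycles, while the dependent pair takes 8 because the second instruction sits out three cycles waiting for its input. That is the same time the two instructions would take with no pipelining at all.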