Of all the bits of jargon and terminology used in computer architecture, the term 'bubble' is one of the more whimsical, as well as being wonderfully descriptive and almost poetically succinct. The word is used to refer to the phenomenon of idle stages in a processor pipeline, and the name, by analogy, tells us a little about the behaviour of such idle units.

Let's illustrate this with a simple example. Consider a classic 5-stage RISC pipeline, like that of an early MIPS processor or the ARM processor in your mobile phone, and a simple program fragment:

Pipeline:

Fetch -> Decode -> ALU -> Memory -> Writeback

...and our program fragment, a simple implementation of the strlen() function:

        mov     r1, r0          ; save pointer
        mov     r0, #0          ; initialise count
loop:   ldrb    r2, [r1], #1    ; load byte from memory at r1, then increment r1.
        cmp     r2, #0          ; compare byte with '\0'.
        addne   r0, r0, #1      ; increment string length
        bne     loop            ; loop back round if it wasn't the terminator
        mov     pc, lr          ; return from function

(This is obviously some handwritten assembly code rather than the output of a compiler. You can tell because of the comments, and because any compiler worth its salt would have eliminated that 'addne' in the loop in favour of a subtraction outside the loop using the exit value of the induction variable...)
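That optimisation is easiest to see in a higher-level sketch (Python here purely for illustration; the function names are made up): counting a byte per iteration, versus walking to the terminator and doing a single subtraction at the end.

```python
def strlen_counting(buf):
    # mirrors the assembly above: bump a counter for every non-zero byte
    n = 0
    i = 0
    while buf[i] != 0:
        n += 1
        i += 1
    return n

def strlen_subtracting(buf):
    # what an optimising compiler prefers: find the terminator, then
    # take one difference (end pointer minus start pointer); the
    # per-iteration add has disappeared from the loop
    i = 0
    while buf[i] != 0:
        i += 1
    return i   # end index minus start index (which is 0)

print(strlen_counting(b"hello\0"), strlen_subtracting(b"hello\0"))
```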

Instructions flow through the pipeline in the obvious order: they're fetched from memory in the Fetch stage. They're decoded in the Decode stage, where the processor works out what else it needs to do in order to execute the instruction. In the ALU stage, simple arithmetic instructions are executed, and the addresses for memory instructions are calculated. In the Memory stage, loads and stores access memory using the addresses they previously calculated in the ALU, and in Writeback, assuming any memory accesses went off without a hitch, the results of the instruction are written back to the register file.
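The flow just described can be sketched as a toy simulation. This is a sketch, not real hardware: the pipeline is modelled as a five-slot list that shifts once per cycle, and the instruction labels are just strings.

```python
# Toy model of instructions flowing through the five stages.
# Slot 0 is Fetch, slot 4 is Writeback; every cycle, each instruction
# moves one slot to the right and a new instruction is fetched.

def run(program, cycles):
    """Return the pipeline contents (Fetch..Writeback) for each cycle."""
    pipeline = [None] * 5
    pc = 0
    history = []
    for _ in range(cycles):
        pipeline = [None] + pipeline[:-1]   # everything advances one stage
        if pc < len(program):
            pipeline[0] = program[pc]       # fetch the next instruction
            pc += 1
        history.append(list(pipeline))
    return history

prog = ["mov r1,r0", "mov r0,#0", "ldrb", "cmp", "addne", "bne"]
for cycle, stages in enumerate(run(prog, 6), start=1):
    print(cycle, stages)
```

With no hazards, by cycle 5 every stage is holding an instruction, which is exactly the full-pipeline state the next diagram shows.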

So the instruction sequence above can execute, with the help of a little register bypassing, without much problem for the first few cycles: each instruction follows hot on the heels of the previous one, and each pipeline stage is busy doing something useful in every cycle. Until, that is, the pipeline hits the following state:

Fetch -> Decode -> ALU  -> Memory -> Writeback
addne    cmp       ldrb    mov r0,#0  mov r1,r0

What happens in the next cycle? At first glance it seems obvious: the movs move on, the ldrb moves into the Memory stage, and the cmp instruction moves to the ALU. But there's a problem. The cmp instruction needs the value loaded into r2 by the ldrb instruction. But the load instruction hasn't yet entered the Memory stage. The value loaded from memory will only be available at the end of the next cycle, but it's needed as an input to the ALU at the beginning of that cycle.

So the comparison instruction must wait in Decode while the load instruction moves into the Memory stage and completes its memory access.

So during the next cycle, the pipeline looks a little like this:

Fetch -> Decode -> ALU  -> Memory -> Writeback
addne    cmp      (empty)  ldrb      mov r0,#0

During this cycle, the ALU pipeline stage sits idle: the cmp behind it is stuck in Decode, waiting for the result of the load in the Memory stage. It's a gap in an otherwise full, efficiently-operating pipeline.
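In hardware, this stall is enforced by an interlock. A minimal sketch of the detection condition follows; the dict fields (`is_load`, `writes`, `reads`) are an assumed toy representation for illustration, not a real ISA description.

```python
# Sketch of a load-use interlock check. Each instruction is a dict
# recording the register it writes and the registers it reads.

def must_stall(decode, alu):
    """True when the instruction in Decode reads a register that a
    load currently in the ALU stage will only produce after Memory."""
    return (decode is not None and alu is not None
            and alu["is_load"]
            and alu["writes"] in decode["reads"])

ldrb = {"name": "ldrb", "is_load": True,  "writes": "r2", "reads": ["r1"]}
cmp_ = {"name": "cmp",  "is_load": False, "writes": None, "reads": ["r2"]}

print(must_stall(cmp_, ldrb))   # the cmp must wait for the ldrb: True
```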

In the next cycle, though, the memory result is available, and the pipeline can go about its business. Every instruction is free to move on to the next pipe stage:

Fetch -> Decode -> ALU  -> Memory -> Writeback
bne      addne     cmp     (empty)   ldrb

But what's this? Not only have the instructions moved on, but so has the idle slot. Since there was no instruction in the ALU stage in the last cycle, there is now no instruction to move into the Memory stage. It's a gap, an absence of work, but it moves through the pipeline in a way that makes it look almost like a real instruction.

It's this behaviour that leads us to call these phenomena 'bubbles'. Much like bubbles in a narrow pipe, these gaps move along the pipeline, reducing the efficiency of the processor.
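Putting the pieces together, a toy simulation (again using a made-up dict representation of instructions, purely as a sketch) shows the bubble being injected into the ALU stage and then drifting along the pipe:

```python
# Toy pipeline with a load-use interlock. When the instruction in
# Decode needs a register that a load in the ALU stage has not yet
# produced, Fetch and Decode hold still and a bubble ('-') enters
# the ALU stage, then drifts towards Writeback on later cycles.

def must_stall(decode, alu):
    return (decode is not None and alu is not None
            and alu["is_load"] and alu["writes"] in decode["reads"])

def run(program, cycles):
    pipeline = [None] * 5   # Fetch, Decode, ALU, Memory, Writeback
    pc = 0
    trace = []
    for _ in range(cycles):
        if must_stall(pipeline[1], pipeline[2]):
            # Fetch and Decode hold; a bubble enters the ALU stage
            pipeline = pipeline[:2] + [None] + pipeline[2:-1]
        else:
            pipeline = [None] + pipeline[:-1]
            if pc < len(program):
                pipeline[0] = program[pc]
                pc += 1
        trace.append([i["name"] if i else "-" for i in pipeline])
    return trace

def insn(name, is_load=False, writes=None, reads=()):
    return {"name": name, "is_load": is_load,
            "writes": writes, "reads": list(reads)}

prog = [insn("mov r1,r0", writes="r1", reads=["r0"]),
        insn("mov r0,#0", writes="r0"),
        insn("ldrb", is_load=True, writes="r2", reads=["r1"]),
        insn("cmp", reads=["r2"]),
        insn("addne", writes="r0", reads=["r0"]),
        insn("bne")]

for stages in run(prog, 7):
    print(" | ".join(s.ljust(9) for s in stages))
```

Cycle 6 of the printed trace shows the bubble appearing in the ALU stage, and cycle 7 shows it having drifted into Memory, just as in the diagrams above.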

Pipeline bubbles occur any time a pipeline must stall an instruction while it waits for the result of some other instruction, or when a branch is mispredicted, or when cache misses hold things up. In pipelines without register bypassing they're even more common, since every read-after-write dependency forces a stall; and in more elaborate designs, write-after-write and write-after-read hazards can introduce them too.

So! Next time you think your computer is going too slow, you know what to do: check it for bubbles!