Imagine the registers of your CPU (or your program's variables) as a set of horizontal binary strings:

bit 7 6 5 4 3 2 1 0
a = 0 1 1 0 0 0 1 1
b = 0 0 0 1 0 1 1 1
c = 1 0 1 0 1 0 0 1

A vertical counter treats each column, rather than each row, as a separate number, with row a as the most significant bit and row c as the least. That lets you keep 8 counters (or as many as your word has bits) in parallel. (Reading from bit 0 up to bit 7, the 8 values above are 7, 6, 2, 1, 2, 5, 4, and 1.)
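To make the column reading concrete, here is a minimal Python sketch that extracts each counter from the three rows above. The helper name read_counters and the 8-bit width are assumptions made for illustration, not part of the original description.

```python
# Rows from the example, most significant first (bit 7 is the leftmost digit).
a = 0b01100011
b = 0b00010111
c = 0b10101001

def read_counters(rows, width=8):
    """Return the value of each column, indexed by bit position (bit 0 first)."""
    values = []
    for bit in range(width):
        value = 0
        for row in rows:  # rows[0] supplies the most significant bit
            value = (value << 1) | ((row >> bit) & 1)
        values.append(value)
    return values

print(read_counters([a, b, c]))  # [7, 6, 2, 1, 2, 5, 4, 1]
```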

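To increment all of the counters at once, work from the most significant row down: xor each row with the AND of all the rows below it (that is the carry into the row), then invert the least significant row. The order matters, because each step needs the old values of the lower rows: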

a = a xor (b and c)
b = b xor c
c = not c
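The three steps above might look like this in Python. This is a sketch under a few assumptions: the name increment_all is hypothetical, and because Python integers are unbounded the "not" has to be masked to the chosen width (8 columns here).

```python
MASK = 0xFF  # 8 columns; assumed width for this example

def increment_all(a, b, c):
    """Add 1 to every 3-bit column counter; a is the most significant row."""
    a ^= b & c      # carry reaches a only where both lower bits are 1
    b ^= c          # carry reaches b where the lowest bit is 1
    c = ~c & MASK   # the lowest bit always flips
    return a, b, c

a, b, c = increment_all(0b01100011, 0b00010111, 0b10101001)
print(f"{a:08b} {b:08b} {c:08b}")  # 01100010 10111110 01010110
```

Read column by column from bit 0, the result is 0, 7, 3, 2, 3, 6, 5, 2: every counter went up by one, and the counter that held 7 wrapped around to 0 because there are only three rows.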

Now having 8 parallel counters isn't too useful unless you can selectively increment them. With a little thought, this is fairly easy: add a temporary row d at the bottom, with a 1 in every column you want to increment. When you increment the resulting 4-row counter, each bit set in d causes a carry into row c, which effectively increments the original 3-row counter in that column. Columns where d is 0 produce no carry, so they are left unchanged.

Say you want to increment columns 3, 4, and 7:

bit 7 6 5 4 3 2 1 0
a = 0 1 1 0 0 0 1 1
b = 0 0 0 1 0 1 1 1
c = 1 0 1 0 1 0 0 1
d = 1 0 0 1 1 0 0 0   Set to 1 where you want to inc.

a = a xor (b and c and d)
b = b xor (c and d)
c = c xor d
d = not d

You can skip inverting row d, since it was only there to select which columns to increment and is thrown away afterwards. After this, you have:

bit 7 6 5 4 3 2 1 0
a = 0 1 1 0 0 0 1 1
b = 1 0 0 1 1 1 1 1
c = 0 0 1 1 0 0 0 1
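
Reading from bit 0 up to bit 7, that is 7, 6, 2, 2, 3, 5, 4, and 2: columns 3, 4, and 7 have each gone up by one and the others are unchanged. Yay!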
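
The selective increment can be sketched the same way in Python. The function name increment_selected is an assumption for illustration; the inputs are the rows a, b, c and the selector d from the example.

```python
def increment_selected(a, b, c, d):
    """Add 1 to the columns where d has a 1 bit; a is the most significant row."""
    a ^= b & c & d  # carry reaches a only where b, c and d are all 1
    b ^= c & d      # carry reaches b where c and d are both 1
    c ^= d          # selected columns flip their lowest bit
    return a, b, c  # d is discarded, so it is never inverted

a, b, c = increment_selected(0b01100011, 0b00010111, 0b10101001, 0b10011000)
print(f"{a:08b} {b:08b} {c:08b}")  # 01100011 10011111 00110001
```

Passing a selector with every bit set gives the same result as the unconditional increment, and a selector of 0 leaves all of the counters alone.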