iterator (idea) by wharfinger

The word "iterator" means different things in different contexts and programming languages, but it's always got something to do with visiting each one of a set of objects, where "object" doesn't necessarily mean "class instance": Just take "object" to mean "some instance of some data type, somewhere".

In C++, you usually hear the term used with regard to the container classes belonging to the Standard Template Library (commonly called "the STL").

In STL terms, an "iterator" has "pointer semantics": It looks and smells a lot like a pointer. Sometimes it is a pointer, when that's appropriate, but other times it isn't (as for the ones that are classes, some implementations will inherit some or all of them from a common base class, and others won't; this template stuff is generic programming, not OOP, and inheritance is sort of moot). The purpose is to provide an abstraction of pointer behavior. The container you've got may be an associative array (std::map), a one-dimensional dynamic array (std::vector), or a linked list (std::list). Whatever it is, the library guarantees that you can get an iterator to the first element in the list by calling the container's begin() function, and it guarantees that you can call its end() function to get a "NULL" value: If you've got an iterator belonging to a given instance foo of an STL container, all invalid iterators belonging to foo will be equal to foo.end().

In theory, STL iterators do what you'd expect a pointer to do, but in practice they tend to have limits, which we'll discuss later. Their interface uses pointer operators: You dereference an iterator to get the thing it points to; you increment it with the ++ operator (pre or post), and so on.

The upshot of all this is that you can write generic algorithm code that neither knows nor cares what kind of container it operates on, because it interacts with the container through this "iterator" abstraction. You can write the following code to traverse a series of elements contained in any given STL container class (and when you stop to think about it, this will work just fine on old-fashioned int foo[], too):

//  If you don't understand C++ templates reasonably well, 
//  this may be somewhat perplexing.
template< class iter_T >
void dump( iter_T from, iter_T to )
{
    for ( /**/; from != to; ++from )
        cout << *from;  //  ...or whatever floats your boat, really.
}

And here's how you'd use it:

    std::vector< int >  foo;
    std::list< double > bar;

    //  ...put something in the containers...

    //  Spray it at the user:
    dump( foo.begin(), foo.end() );
    dump( bar.begin(), bar.end() );

The std::vector iterators are almost certainly pointers in your implementation. Other iterators aren't; for a linked list class template, you might have something like this:

    //  T is the contained type, element_T is the list-element type.
    //  This is right off the top of my head.
    template < class T, class element_T >
    class iterator {
    public:
        ...
        //  preincrement
        iterator & operator++() {
            //  No sanity checking. That's the user's problem.
            _el = _el->next;
            return *this;
        }

        T & operator*() {
            return _el->data;
        }
        ...

    private:
        element_T * _el;
    }

Some iterators are "bidirectional"; some aren't. If you've got a singly-linked list, you can't go home again, so to speak: It's convenient to walk the list in the forward direction, but walking a singly-linked list backwards would be monstrously inefficient. To get from the fortieth element to the thirty-ninth, you'd have to start over at the first element and walk all the way back up to number thirty-nine. That's loony. You can do it by hand if you like, but the STL won't help you do crazy things like that. You're supposed to be able to assume that any supported operation is reasonably efficient.

Some iterators allow random access, and some don't. Naturally, an iterator belonging to a linked list or a tree won't do that: Again, that operation is too inefficient to take seriously. On the other hand, std::vector's iterators do provide random access:

    //  Stupid, but legal.
    std::vector< int >    foo( 100 );

    int n = *(foo.begin() + 23);    //  Pointer syntax, remember?

That's because it's easy and cheap to figure an offset in an array, which is what std::vector really is. You can see we're not talking about real theoretically pure, industrial strength abstraction here. In C++, reality is always acknowledged. LISP aggressively disregards reality, but that's a different language.

Some other languages have things they call iterators, too. One that springs to mind is Ruby, an object oriented scripting language native to the romantic faraway island of Japan, which K&R C programmers still call "Cipangu".

Ruby's "iterators" don't look much like their namesakes in C++. They're a little spoonful of syntactic sugar involving closures. They look like this:

    #   Print 0 through 2
    #   times() is a member function of Ruby's number class (FixNum, 
    #   for smaller numbers). In Ruby, number and string literals are 
    #   class instances.
    3.times { |x| puts( x ); }

    #   step() is another one: Count from this up to A in steps of 
    #   size B. This will print 0, 3, 6, 9, and 12
    0.step( 12, 3 ) { |x| puts( x ); }

    #   IO is the stream abstraction class. Its static member function
    #   foreach() reads a file as lines of text, and yields up each 
    #   line in turn to the block of code that you give it.
    i = 0;
    IO.foreach( "inputfile" ) { |line|  #   weird way to pass args, huh?
        print( i, ": ", line );
        #   Infuriatingly, Ruby doesn't support ++ properly.
        i = i + 1;
    }

They're handy, but they're really just a convenient notation for calling a callback. You could write essentially the same code in any language that supports closures. Take for example the following implementation of the number class's times() iterator:

    #   Ruby
    def FixNum.times()
        i = 0;

        while ( i < self )
            #   yield is used in the sense of "render up", not 
            #   "defer to". A yield statement calls the block of 
            #   code that the user provides, and passes its 
            #   argument(s) to the block.
            yield i;
        end
    end

    //  The same in JavaScript (we're adding a member 
    //  function; JavaScript objects if in need of greater 
    //  clarity)
    Number.prototype.times = function( callback ) {
        for ( var i = 0; i < this; ++i )
            callback( i );
    }

    //  Use it like so (JS doesn't like calling member functions
    //  on integer literals, though it's happy to do the same on 
    //  a string):
    var foo = "urk: ";
    Number( 3 ).times( function( n ) { print( foo + n ); } );

So, okay, big whoop. There's nothing special or magical about Ruby iterators, except for the fact that the designer of the language loves the damn things to distraction. They're ubiquitous in the built-in classes, and in practice that makes the language very pleasant to work with. It's a bit like the ubiquity of the C idioms favored in K&R. It's not really inherently part of the language, it's just the way everybody who uses the language learns to use it. Good manners dictate that I keep my mouth shut about Larry Wall's influence on the Perl community.

C++: no compilers available	for loop	Why C++ rules	Cipangu
C++: brackets to make an object not a function	Storing a doubly-linked list using just a single pointer field	list comprehension	Closure
One equals point nine repeating	Iter	Generator	lambda
Python	C++ input iterator	bounds checking	Pay attention, this will be important later
Babysitting a gigantic five year old	generic programming	refactoring	method
Pointy sub	deque	Learning to program	"i" is a meaningful variable name