pure virtual - Everything2.com

If you know some C++ but you're not sure what a virtual function is, please read the virtual function writeup first. If you don't know C++ at all, read Halspal's stuff instead. He's really good.

In the C++ programming language, a "pure virtual" function is a class member function which the class declares but does not have to implement. It is, of course, declared virtual. If a call ever lands on a pure virtual function that has no implementation, the behavior is undefined; in practice your program will crash, because there's no function there. It turns out that calling a pure virtual is possible, and even fun, if that's your idea of fun¹.

You declare a pure virtual function by putting "= 0" at the end of the declaration of a virtual function:

    class foo {
        virtual void bar() = 0;
    };

We write pure virtual functions in cases where we want to specify an interface which will be inherited by a subclass, but we are unable or unwilling to specify anything about the implementation behind that interface. In a typical implementation, a pure virtual has an entry in the vtable that points nowhere useful (often at a stub that aborts the program). When a class has one or more pure virtual functions, it becomes an "abstract class": What that means is that you can't create an instance of it. What you can do is declare a pointer to it, or a reference to it.

You could have an array of pointers to a base class, each pointing to an actual instance of one subclass or another. Say the base class has a pure virtual called foo(): When you buzz down the array calling p->foo() through each pointer, each instance will do its thing, whatever that may be. That also works fine with regular virtual functions. So why bother with pure virtuals rather than ordinary stubs declared virtual?

The key thing about pure virtuals is that all of the pure virtuals in a given base class must be implemented by any subclass you intend to instantiate. A subclass that leaves even one of them unimplemented is itself abstract. You're defining an absolute requirement: The subclasser must fill in these particular blanks. You're also defining an absolute prohibition: This base class is not a complete piece of code. No, you may not instantiate it, because to do so would be idiotic.
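To make the last two paragraphs concrete, here is a minimal sketch. The class names (animal, quiet_animal, rabbit, dog) and the function roll_call() are made up for illustration, and the static_assert checks assume a C++11 or later compiler. It shows a subclass that fills in only some of the blanks and so stays abstract, and an array of base class pointers where each instance does its own thing.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

//  Hypothetical abstract base class: two blanks to fill in.
class animal {
public:
    virtual ~animal() {}
    virtual std::string name() = 0;
    virtual std::string speak() = 0;
};

//  Fills in only one blank, so it is still abstract.
class quiet_animal : public animal {
public:
    virtual std::string speak() { return "..."; }
};

//  Fills in the remaining blank; instances are now allowed.
class rabbit : public quiet_animal {
public:
    virtual std::string name() { return "rabbit"; }
};

//  Fills in both blanks directly.
class dog : public animal {
public:
    virtual std::string name()  { return "dog"; }
    virtual std::string speak() { return "woof"; }
};

static_assert( std::is_abstract<animal>::value,       "animal is abstract" );
static_assert( std::is_abstract<quiet_animal>::value, "still abstract" );
static_assert( !std::is_abstract<rabbit>::value,      "rabbit is concrete" );

//  animal a;         //  would not compile
//  quiet_animal q;   //  would not compile either

//  Buzz down an array of base class pointers.
std::string roll_call() {
    dog     d;
    rabbit  r;
    animal * pets[] = { &d, &r };

    std::string out;
    for ( std::size_t i = 0; i < sizeof( pets ) / sizeof( pets[ 0 ] ); ++i )
        out += pets[ i ]->name() + " says " + pets[ i ]->speak() + "\n";
    return out;
}
```

Calling roll_call() returns one line per animal, each produced by that subclass's own implementations.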

How can you call a pure virtual? It's easy, and if you have at least a primitive understanding of how classes really work in C++, it's obvious.

A subclass in C++ is an instance of the base class with other stuff tacked on. It's even laid out that way in memory. Remember, all of this stuff is translated into pointer arithmetic when you compile it. When a subclass instance is created, the constructors are called in order down the inheritance chain, base class first. Imagine that class baz inherits from class bar, and bar inherits from class foo. Think of it like the rings of a tree: At runtime, a baz instance consists of a foo instance surrounded by an accretion of bar stuff, surrounded by an accretion of baz stuff. Likewise, when you instantiate baz, the foo constructor is called, then the bar constructor, and then the baz constructor.

Here's the point: While the foo constructor is running, the object is not yet an instance of bar or baz. It is only foo. Now, if foo is an abstract class (that's a class with pure virtual member functions, remember), you cannot in principle instantiate it. However, for one brief moment while a subclass instance is being constructed, an instance of abstract class foo does in fact exist.

A direct call to a pure virtual from inside a constructor is obvious enough that most toolchains will catch it: the compiler warns, or the linker complains that the function doesn't exist. But as of this writing, it's not common for them to look very deeply. You can call a "real function" which in turn calls a pure virtual, and the compiler won't make a fuss. When the code runs, it will try to call a nonexistent function and your program will crash.
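You can watch the object "become" each class in turn without crashing anything. In this sketch, who() is deliberately an ordinary virtual rather than a pure one, so every call is well defined; report(), g_log and construct_baz() are hypothetical helpers invented for the demonstration. If who() were pure in foo, the first report() call would be the crash described above.

```cpp
#include <string>

//  Hypothetical log that records what each constructor saw.
static std::string g_log;

class foo {
public:
    foo() { report( "foo ctor" ); }
    virtual ~foo() {}
    //  Ordinary virtual, NOT pure, so calling it early is safe.
    virtual const char * who() const { return "foo"; }

protected:
    //  A "real function" that makes a virtual call on our behalf.
    void report( const char * where ) {
        g_log += where;
        g_log += " sees ";
        g_log += who();
        g_log += "\n";
    }
};

class bar : public foo {
public:
    bar() { report( "bar ctor" ); }
    virtual const char * who() const { return "bar"; }
};

class baz : public bar {
public:
    baz() { report( "baz ctor" ); }
    virtual const char * who() const { return "baz"; }
};

//  Construct one baz and return what the constructors logged.
std::string construct_baz() {
    g_log.clear();
    baz b;
    return g_log;
}
```

The string returned by construct_baz() has three lines: the foo constructor sees "foo", the bar constructor sees "bar", and only the baz constructor sees "baz".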

Destruction of C++ class instances is a mirror image of construction. In our baz/bar/foo example, the baz destructor will be called, then the bar destructor, then finally the foo destructor. When you get back down to foo, all you've got is an instance of foo squatting amidst the forgotten wreckage of whatever belonged to the subclass instances. Once again, we have an instance of foo on our hands. Once again, a direct call to a pure virtual function is likely to be caught at build time, but we can still call one indirectly and crash the program.
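The mirror image can be logged the same way. As before, this is only a sketch: who() is an ordinary virtual so that nothing crashes, and note(), g_log and destroy_baz() are hypothetical names.

```cpp
#include <string>

//  Hypothetical log that records what each destructor saw.
static std::string g_log;

static void note( const char * where, const char * who ) {
    g_log += where;
    g_log += " sees ";
    g_log += who;
    g_log += "\n";
}

class foo {
public:
    virtual ~foo() { note( "~foo", who() ); }
    //  Ordinary virtual, NOT pure, so calling it late is safe.
    virtual const char * who() const { return "foo"; }
};

class bar : public foo {
public:
    virtual ~bar() { note( "~bar", who() ); }
    virtual const char * who() const { return "bar"; }
};

class baz : public bar {
public:
    virtual ~baz() { note( "~baz", who() ); }
    virtual const char * who() const { return "baz"; }
};

//  Create and destroy one baz, then return what the destructors logged.
std::string destroy_baz() {
    g_log.clear();
    {
        baz b;
    }   //  b is destroyed here: ~baz, then ~bar, then ~foo
    return g_log;
}
```

By the time ~foo runs, who() answers "foo": the baz and bar layers are already gone.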

So don't do that.

Here's an example of pure virtuals in action. We'll implement a very simple-minded input/output abstraction, with an abstract base class defining a simple, generalized IO interface. We'll then define two moderately useful concrete subclasses and write some pointless code that instantiates them and calls a few member functions. In real life, we'd use iostream classes from the C++ standard library, because they're standard and somebody wrote them already.

Why would you want code like this? Imagine that you've got some very complicated data structure that you'd like to serialize to and from a disk file, or maybe to a buffer so you could put it in the clipboard (and from a buffer so you can get it back out), or maybe to somewhere else. You'd like to write the serialization code only once, so you use an IO abstraction. Your BigComplexThing class might have write() and load() member functions, which would handle all the details of serializing and unserializing. write() would be given a reference to an output base class: When you call it, you might give it a reference to a subclass that writes to a disk file, or a subclass that writes to a buffer. The subclass might translate UTF-8 to UTF-16 or vice-versa; it could do anything. It might just grep the output for a particular string and ignore the rest. The possibilities are endless.
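As a rough sketch of that idea: everything here is hypothetical, including the output interface, the two subclasses, the width/height members standing in for a "very complicated data structure", and the "640x480" text format. The only point is that write() is written once and never learns where the characters end up.

```cpp
#include <cstddef>
#include <string>

//  Hypothetical output interface: one blank to fill in.
class output {
public:
    virtual ~output() {}
    virtual void put( int c ) = 0;
};

//  Collects the characters in a string.
class string_output : public output {
public:
    virtual void put( int c ) { text += static_cast<char>( c ); }
    std::string text;
};

//  Ignores the characters and just counts them.
class counting_output : public output {
public:
    counting_output() : count( 0 ) {}
    virtual void put( int ) { ++count; }
    std::size_t count;
};

//  Stand-in for the complicated thing we want to serialize.
class BigComplexThing {
public:
    BigComplexThing( int w, int h ) : width( w ), height( h ) {}

    //  Written once; works with any subclass of output.
    void write( output & out ) const {
        emit( out, width );
        out.put( 'x' );
        emit( out, height );
    }

private:
    //  Writes a non-negative integer in decimal, one digit at a time.
    static void emit( output & out, int n ) {
        if ( n >= 10 )
            emit( out, n / 10 );
        out.put( '0' + n % 10 );
    }

    int width;
    int height;
};

//  Serialize the same object to two different destinations.
std::string demo( std::size_t & counted ) {
    BigComplexThing thing( 640, 480 );

    string_output   s;
    counting_output c;
    thing.write( s );
    thing.write( c );

    counted = c.count;
    return s.text;
}
```

demo() hands the same object to both subclasses: one ends up holding the text, the other only knows how many characters went by.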

So here's some example code that builds (or not, if you define DONT_COMPILE as nonzero while CRASH_ME is zero). These classes assume that you're reading and writing strings.

//------------------------------------------------------------------
//  Pure virtual function example
//  2/17/2002
#include <stdio.h>

#define CRASH_ME        0
#define DONT_COMPILE    0

//  This is an "abstract class" because it has pure virtual
//  member functions.
class inout {
public:
    //  Pure virtuals: A subclass could be getting the characters
    //  from anywhere, or putting them anywhere, so this base class
    //  cannot and should not address the details at all. In Java, 
    //  we'd write this as an "interface". The Java "interface" 
    //  feature is just a fancy way of saying "Jim Gosling tends to 
    //  go for very specific features where Bjarne Stroustrup would 
    //  prefer more generalized ones." Let's all shrug in unison.
    virtual int     get() = 0;
    virtual void    put( int c ) = 0;
    virtual bool    canget() = 0;
    virtual bool    canput() = 0;

    //  This is where it gets fun: We don't know or care where our 
    //  characters are coming from, but we can write higher-level 
    //  functions anyway. The subclass only has to implement a
    //  small set of low-level IO functions. 
    virtual char *  gets( char * s, size_t size ) {
        if ( size == 0 )
            return s;
        //  p is declared outside the loop because we need it 
        //  afterwards; we stop one short to leave room for the 
        //  terminating null.
        char * p = s;
        for ( ; canget() && size > 1; --size )
            *p++ = (char)get();
        *p = 0;
        return s;
    }

    //  Same deal. We don't care where the subclass puts the 
    //  characters; the logic of stuffing a string into a hole is 
    //  the same regardless of where the hole leads.
    virtual void    puts( const char * s ) {
        while ( *s && canput() )
            put( *s++ );
    }

    //  Constructor
    inout() {
#if ( CRASH_ME )
        //  Blammo! Pure virtual function call! Bad medicine.
        //  puts() ends up calling the pure virtuals canput() and 
        //  put(), because when you instantiate a subclass, the base 
        //  constructor is called before the subclass constructor. 
        //  At this point, our object is not yet an instance of the 
        //  subclass. It's just an instance of inout. Therefore, the 
        //  subclass implementations do not yet "exist".
        puts( "constructor" );
#elif ( DONT_COMPILE )
        //  Directly calling a pure virtual is too damn obvious. 
        //  The compiler is likely to scold you and refuse to 
        //  compile. MSVC says put() is an unresolved external.
        //  That's just what it is, right?
        put( 'c' );
#endif
    }

    //  Destructor
    virtual ~inout() {
#if ( CRASH_ME )
        //  This is the same principle as in the constructor: When  
        //  the base class destructor is called, the object is no 
        //  longer an instance of the subclass. Once again, we're 
        //  calling a pure virtual function at a time when no 
        //  implementation of it "exists". We'll crash.
        puts( "destructor" );
#elif ( DONT_COMPILE )
        //  Ditto.
        put( 'c' );
#endif
    }

};


//  Write to stdout, read from stdin
class stdinout : public inout {
public:
    virtual int get() {
        return getchar();
    }

    virtual void put( int c ) {
        putchar( c );
    }

    virtual bool canget() {
        //  Some compilers get snippy about implicitly 
        //  casting int to bool. 
        return feof( stdin ) == 0;
    }

    virtual bool canput() {
        return true;
    }
};


//  Write to a string, read from a string.
class stringinout : public inout {
public:
    stringinout( char * s, size_t len )
        : str( s ), offs( 0 ), length( len )    //  same order as the member declarations
    { /* stub */ }

    virtual int get() {
        return str[ offs++ ];
    }

    virtual void put( int c ) {
        str[ offs++ ] = c;
        str[ offs ] = 0;
    }

    virtual bool canget() {
        return offs < length - 1;
    }

    virtual bool canput() {
        return offs < length - 1;
    }

private:
    char *  str;
    size_t  offs;
    size_t  length;
};


int main()
{
    stdinout    sio;

    sio.puts( "foo\n" );

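    //  buf only has room for two characters plus the terminator, 
    //  so canput() cuts "bar\n" down to "ba".
    char buf[ 3 ];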
    stringinout strio( buf, sizeof( buf ) );

    strio.puts( "bar\n" );

    sio.puts( buf );

    return 0;
}

¹ ...and it should be.
