We will now discuss a small corner of the standard library of the C programming language.

The vprintf() "family" of functions is an odd branch of the printf() family. All of these functions are used to convert intrinsic C data types into strings, with formatting dictated by the caller. The precise details of that conversion are far too elaborate to describe here.

When you look at the number of options available in a printf()-family format string, it looks pretty daunting to think about implementing several of these things. In particular, it seems idiotic to write both printf() and fprintf() as separate functions: After all, printf() does exactly the same thing as fprintf() except that it always does it to the stdout stream.

You'd like to do something like this when you implement printf():

    int printf( const char * format, ... )
    {
        fprintf( stdout, format, ... );
    }

You'd like to do that, but you can't, because that ... doesn't really mean anything. It tells the compiler not to worry about the number and type of arguments that appear there, but that's all it does. You can't just go and pass ... to another function. However, if we're familiar with stdarg.h, we know that there is a way to get a grip on a list of function arguments whose number and type are not known. This is where vprintf() and friends come into the picture (man vprintf if you've got a UNIX/Linux box handy):

    int vprintf( const char * format, va_list ap );
    int vfprintf( FILE * stream, const char * format, va_list ap );
    int vsprintf( char * str, const char * format, va_list ap );
    int vsnprintf( char * str, size_t size, const char *format, va_list ap );

You'll notice that these correspond to the members of the main branch of the printf family, except that where you expect to see ..., you see va_list ap instead (that's "ap" for "argument pointer", I guess). sprintf() writes its output into a string buffer; fprintf() barfs it into an output stream. The 'n' in vsnprintf() means roughly the same thing as the 'n' in strncpy(): Stop after that many characters, so as to guard against buffer overflows. snprintf() and vsnprintf() are not in the original ANSI C (C89) standard, by the way; C99 added them, and GCC and MSVC both support some version of them, so what the hell.
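To make that 'n' concrete, here's a minimal sketch. It assumes a C99-conforming snprintf(), which always terminates the buffer and returns the length the full output would have had. The function format_id() and its "%s-%d" format are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes "name-id" into buf, never touching more
   than size bytes (terminating '\0' included).
   Returns 1 if the output had to be cut short, 0 if it all fit.
   Assumes C99 semantics: snprintf() returns the untruncated length. */
int format_id(char *buf, size_t size, const char *name, int id)
{
    int n = snprintf(buf, size, "%s-%d", name, id);

    /* A negative result is an error; n >= size means truncation. */
    return n < 0 || (size_t)n >= size;
}
```

With an eight-byte buffer, formatting "overflow" and 42 should leave "overflo" in the buffer and report truncation. Older runtimes return -1 on truncation instead, which this check also treats as "cut short".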

Having a va_list version of each function makes it relatively easy to call one variable-argument function from another:

    int printf( const char * format, ... )
    {
        int rtn = 0;

        va_list args;

        va_start( args, format );

        rtn = vfprintf( stdout, format, args );

        va_end( args );

        return rtn;
    }

With a little extra fussing around (locking the stream to be thread safe, for example), that's essentially what your compiler's runtime library is probably doing. As far as Microsoft's C runtime goes, I can guarantee it, because they ship the source with the compiler and I just checked [1]. The other computer on my desk is a Linux box, but, ironically, it doesn't seem to have the glibc source installed.

You can always get a better handle on these things when you try to use them, so let's try an example. It's hard to think of one, because we haven't written very many C runtime libraries lately. One obvious use might be adding a format() member function to a string class, even though most of us don't write very many of those either:

    #include <stdio.h>
    #include <stdarg.h>

    ...

    //  Format as with sprintf()
    strcls & strcls::format( const char * fmt, ... )
    {
        //  We're paranoid, but not paranoid enough.
        const size_t bufsize = 4096;

        resize( bufsize );

        va_list args;

        va_start( args, fmt );

        //  strcls::data is our pointer to the string we contain.
        //  We have to call it something.
        int len = vsnprintf( data, bufsize, fmt, args );

        va_end( args );

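        //  Discard unused portion. A negative or too-large result means
        //  the output was truncated, so keep the whole buffer and make
        //  sure it's terminated (older runtimes don't always do that).
        if ( len >= 0 && size_t( len ) + 1 < bufsize )
            resize( len + 1 );
        else
            data[ bufsize - 1 ] = '\0';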

        return *this;
    }

We use vsnprintf() for buffer overflow safety. It's a good idea, but the problem spills over into another arena: We don't want to run up against that limit if we can possibly help it. We have a choice between allocating an insane amount of memory and possibly thrashing, or else allocating a piddly little amount of memory (4k is piddly) and possibly having less than we need. In most cases, we won't need very much at all: Maybe a hundred bytes or so. It's nutty to be allocating these huge swaths of memory and then freeing 99.5% of it an instant later. We'd like to know how much memory we really need, and allocate accordingly. We could guess based on the length of the format string, but that's idiotic, because the length of the format string tells us nothing about the length of the output: "%65636d" is a legal format string. It is seven characters long, but it produces more than sixty-five thousand characters of output. This leaves us with several options, all rotten:

  1. Ask the caller how many bytes it expects to need. That's an annoyance for the caller, and it'll end in tears anyway because you can't trust that bastard.

  2. Calculate the required length in a convenient but insane way: You could open a temp file, and call vfprintf() to write your output to that file. printf() and friends all return the number of characters in the result string, so this would work. The problem is that it's pathologically ugly and inefficient: File IO is sluggish stuff, guys.

  3. Parse the format string and calculate the required size yourself. We've done this, and it's one of those things we're glad to have done once. It's a major pain in the ass, and it's depressing to know that the runtime will do it all over again an instant later.

    MFC's CString::Format() does this. If you've got a copy of MSVC handy with the MFC source installed, have a look at CString::FormatV() in %MSVCDIR%\mfc\src\strex.cpp (line 432 in the version of MFC that ships with MSVC 6). You'll see the horrible truth for yourself. Our approach would be to put all that logic in a function like so:

        int vfmtlen( const char * format, va_list argp );

    We'd then call that on the rare occasions when we need to go there. Yes, you need the arguments themselves: How long is a string? The only way to know is to have a look.

    It's all a big ugly hassle, but once you've got a format() member for a string class, it's enormously handy to have around.
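For what it's worth, here's a sketch of how vfmtlen() could dodge the parsing. It assumes a C99-conforming runtime, where vsnprintf() with a zero-size buffer writes nothing and returns the length it would have needed. That was not a safe assumption for the runtimes discussed here: glibc before 2.1 and Microsoft's _vsnprintf() return -1 on truncation instead. format_alloc() is a made-up caller, and this is a swapped-in technique, not what MFC does.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch of vfmtlen(): number of characters the formatted output
   needs, not counting the terminating '\0'; negative on error.
   ASSUMPTION: C99 vsnprintf(), which measures without writing
   when given a zero-size buffer. */
int vfmtlen(const char *format, va_list argp)
{
    va_list copy;
    int len;

    /* Work on a copy so the caller's argp can still be used. */
    va_copy(copy, argp);
    len = vsnprintf(NULL, 0, format, copy);
    va_end(copy);

    return len;
}

/* Hypothetical caller: measure first, then allocate exactly enough.
   Returns a malloc()ed string the caller must free(), or NULL. */
char *format_alloc(const char *format, ...)
{
    va_list args;
    char *buf = NULL;
    int len;

    va_start(args, format);

    len = vfmtlen(format, args);
    if (len >= 0) {
        buf = malloc((size_t)len + 1);
        if (buf != NULL)
            vsnprintf(buf, (size_t)len + 1, format, args);
    }

    va_end(args);

    return buf;
}
```

The runtime still walks the format string twice, so the "do it all over again" complaint stands. What goes away is the hand-written parser.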


We conclude that these 'v' functions are a slick notion, but that in practice they tend not to be as useful and convenient as one might hope.




[1] It's even weirder, actually: All of their 'v' functions call a function called _output():

    //  %MSVCDIR%\crt\src\output.c:325:
    //  (truthfully, I've untangled this from a thicket of #ifdefs)

    int _output( FILE * stream, const TCHAR * format, va_list argptr );

TCHAR is a macro which hides whether you're dealing with char or wchar_t. Now here's the creepy bit: That FILE * can be used to write output to a string instead of a stream:

    //  %MSVCDIR%\crt\src\vsprintf.c:93:

    outfile->_flag = _IOWRT|_IOSTRG;
    outfile->_ptr = outfile->_base = string;

We presume that the buffered output functions know what those flags mean and do the right thing. None of this stuff is documented, so don't rely on it.
