I found a weird little oddity in the Microsoft Foundation Classes library, a.k.a. MFC. I happen to think it's a lousy idea in practice, but it's clever, and it's slightly interesting.

In MFC's string class, CString, there is exactly one data member: A char *. Nevertheless, the thing keeps track of the length of the string and of how much memory is allocated (incidentally, they use new[] and delete[] rather than realloc, so resizing probably isn't as efficient as it might be). It maintains a reference count as well, since CString has "pointer semantics" rather than "value semantics". Or whatever you call it.

Here's what's going on: They define a struct, struct CStringData. That struct has ints for reference count, allocated size, and string length. When the time comes to copy a string into a CString instance, they allocate the memory needed for the string, plus the size of struct CStringData. They cast that to (struct CStringData *) and assign to the members appropriately. They then put the string data, the characters themselves, after that. The single char * in class CString points to the first character, with our struct CStringData instance sitting in memory just in front of it. There's a member function (CStringData* CString::GetData() const;) which does the subtraction and returns a pointer to the struct CStringData, and they do pointer fiddling as necessary elsewhere.
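For the curious, here's roughly what that layout looks like in code. This is my own minimal sketch, not MFC's source: StringHeader, make_string, get_header and release are hypothetical names standing in for CStringData and friends, and I'm assuming plain char with no sharing logic beyond a bare reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical stand-in for struct CStringData; the field names are mine.
struct StringHeader {
    int refs;      // reference count
    int length;    // characters in use, not counting the terminator
    int capacity;  // characters allocated, not counting the terminator
};

// Allocate header and characters as one block, then hand back a pointer
// to the first character. The header sits just in front of it.
char* make_string(const char* text) {
    const std::size_t len = std::strlen(text);
    unsigned char* block = new unsigned char[sizeof(StringHeader) + len + 1];
    StringHeader* header = new (block) StringHeader;  // construct header in place
    header->refs = 1;
    header->length = static_cast<int>(len);
    header->capacity = static_cast<int>(len);
    char* chars = reinterpret_cast<char*>(header + 1);  // first byte past the header
    std::memcpy(chars, text, len + 1);                  // copy text plus terminator
    return chars;
}

// The "subtraction" step: step back one header from the character pointer.
StringHeader* get_header(char* chars) {
    return reinterpret_cast<StringHeader*>(chars) - 1;
}

// Drop one reference and free the whole block when nobody is left.
void release(char* chars) {
    StringHeader* header = get_header(chars);
    assert(header->refs > 0);
    if (--header->refs == 0) {
        delete[] reinterpret_cast<unsigned char*>(header);
    }
}
```

Anything that only wants the text can treat the result of make_string as an ordinary C string. Anything that wants the bookkeeping calls get_header on the same pointer.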

By now, you're either way ahead of me, or you've long since stopped reading, or you're marvelling at the bizarre behavior of Microsoft's programmers. It's not as dumb as it seems, actually. Dumb, very dumb, but not perfectly dumb. Here's why:

    CString  cs = "an arbitrary string";
    char     s[64];

    sprintf( s, "foo bar %s baz", cs );

The third and subsequent arguments to sprintf have no declared type, therefore no conversion operator will be called. What will happen is that all of object cs will be copied onto the stack, byte for byte. With most classes, that's trouble: if the first four bytes of the object are an integer equal to, say, 47, sprintf will treat 47 as the address of a string. But if an ignorant or inattentive programmer makes this mistake with a CString instance, it's no problem at all, because CString's only member is a pointer to a string. That'll go on the stack as-is, and everything will go smoothly. This is how I learned about the whole deal: I saw a place in some old code where I'd done just that, but the code worked perfectly. It freaked me out: That code had no excuse for working. So I RTFS, and there it was. Having grokked, I added a damn cast, even though it wasn't strictly needed.
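Here's a small sketch of why the bytes line up, and of the cast I added. TinyString is a hypothetical stand-in for CString (one pointer member plus a conversion operator), and bytes_as_pointer and format_checked are names I made up for illustration. I copy the object's bytes with memcpy instead of actually shoving the object through "...", since the latter isn't something the language promises will work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for CString: a single pointer member, nothing else.
class TinyString {
public:
    explicit TinyString(const char* text) : chars_(text) {}
    operator const char*() const { return chars_; }  // the conversion sprintf never calls
private:
    const char* chars_;
};

// The whole trick depends on this: the object is exactly one pointer wide.
static_assert(sizeof(TinyString) == sizeof(const char*),
              "TinyString must be exactly one pointer wide");

// Reinterpret the object's raw bytes as a pointer. This is roughly what a
// printf-style function ends up reading when handed the object itself.
const char* bytes_as_pointer(const TinyString& s) {
    const char* p = nullptr;
    std::memcpy(&p, &s, sizeof p);
    return p;
}

// The well-behaved version: the explicit cast invokes the conversion
// operator, so a real const char* goes through the "...".
int format_checked(char* out, std::size_t size, const TinyString& s) {
    return std::snprintf(out, size, "foo bar %s baz",
                         static_cast<const char*>(s));
}
```

The static_assert is the part that earns its keep: if somebody adds a second data member later, the build breaks instead of the output.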

Here's why the whole thing is still a dumb idea, even though there's a method to the madness:

  1. It's not clever to teach novice programmers to assume that they can feed a class instance to sprintf and get away with it, because that's only likely to be true for this one class, and the reason is not exactly obvious to a novice.

  2. It's bad policy for the behavior of a class to depend on the arrangement of an instance's bytes in memory, because some bright fella may come along and rearrange it at a later date: Maintenance happens.

  3. It's inconsistent with the rest of MFC: They're addicted to asserts. I'm okay with that. They go overboard sometimes, but they're on the right track. But if you're going to be persnickety and go around asserting things most of the time, don't turn around and go overboard shielding programmers from their own carelessness every once in a while. That's just goofy. Pick one approach, and stick with it.

This is what bothers me most about MFC: They get clever for no good reason but to demonstrate how clever they are, and then they blow off common-sense stuff like abstraction, encapsulation, and whatnot. It's shockingly unprofessional in many ways.


C:\...\DevStudio6\VC98\MFC\Include\afx.h, 'round line 356 or so;
C:\...\DevStudio6\VC98\MFC\src\strcore.cpp, starting line 51.