Some would argue that the most elusive type of problem in a C program is the fact that it's written in C.

Problems that are in some ways endemic to C programs:

  • Memory leaks. This has to be, by far, the subtlest and hardest to fix. ``Easy enough'', you say, ``make sure you free everything you allocate.'' That works in the simple cases. In more complicated situations, it's just not possible. The problem is often in shared data structures---in a language without GC, it becomes important to have a good ownership protocol, and that's hard to do.
  • Buffer overflows. I don't think I need to even go into this.
  • Fixed-size buffers. This is a similar problem. So you found that buffer overflow, and replaced gets(buff) with fgets(buff, 80, stdin). The only problem is, now your program silently truncates lines longer than 79 characters. Lose, lose.

Problems that aren't C-specific:

Solutions? There's no panacea. GC will correct many, but not all, memory leaks, and may introduce more if you forget to break links. Using a safer string or vector library will fix many buffer overflows (as well as arbitrary buffer-size limits), but there are many cases where you need, for whatever reason, to muck around with pointers or C arrays. As for the non-C-specific problems, they can only be fixed through brainwashing. Programmers will always write bad programs, no matter what tools they are given.

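neil: as for fixed-size buffers:
/* needs <stdio.h> and <stdlib.h> */
char *buf = malloc( 1 ), *tmp;      /* start with room for the terminator */
size_t len = 0;
int c;

while( buf != NULL && (c = getchar()) != EOF ) {
   tmp = realloc( buf, len + 2 );   /* one more char, plus the terminator */
   if( tmp == NULL ) {              /* on failure the old block is still ours */
      free( buf );
      buf = NULL;
      break;
   }
   buf = tmp;
   buf[len++] = c;
}
if( buf != NULL )
   buf[len] = 0;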
No loss. That's what realloc is there for. Just don't forget to free().

Speaking of realloc, a tricky one is stale pointers. realloc may move the block, so after you realloc something you have to re-assign every pointer that pointed into the old block; this can be difficult in a threaded program.
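
One way to sidestep the stale-pointer problem, sketched below, is to hand out offsets into the buffer instead of raw pointers, and to do all growing in one place. The names growbuf, growbuf_append and growbuf_at are made up for this example, and it does nothing about locking, so a threaded program would still need its own synchronization around it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: callers keep offsets, never raw pointers. */
struct growbuf {
    char  *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
};

/* Append n bytes from src. Returns the offset of the copy, or (size_t)-1
   on allocation failure. The offset stays valid even if realloc moves
   the block, which a saved pointer would not. */
size_t growbuf_append(struct growbuf *b, const void *src, size_t n)
{
    if (n == 0)
        return b->len;
    if (n > b->cap - b->len) {
        size_t cap = b->cap ? b->cap : 16;
        while (n > cap - b->len)
            cap *= 2;                 /* sketch only: no overflow check */
        char *tmp = realloc(b->data, cap);
        if (tmp == NULL)
            return (size_t)-1;        /* old block is untouched and still owned */
        b->data = tmp;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return b->len - n;
}

/* Turn an offset back into a pointer. Only use it until the next append. */
char *growbuf_at(struct growbuf *b, size_t off)
{
    return b->data + off;
}
```

Start from a zeroed struct (struct growbuf b = {0};), remember the offsets that growbuf_append returns, and call growbuf_at each time you need the bytes. Free b.data when you are done.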

NULL pointer stuff is pretty obvious, so I won't go into that. There's also the problem of never assigning a value to a pointer, then trying to dereference it. That's pretty easy to track and fix once you find it though. There's also accidentally casting an int or some such into a pointer (ouch!), but again, that's pretty easy to track. As for the most elusive, I'd say it's definitely forgetting to free(). The rest can be fixed with a good debugger.

Core-dumping show stoppers are usually easy to locate with the debugger and some decent detective work. After all, you have an event that provides a built-in breakpoint for your debugging. Memory leaks are just evil to find and fix in most cases. You start with little to no clue about where the problem is. You have to step through much of the code to even start to have a hope of finding the problem.

This sort of shite is one of the reasons I use C++. It isn't perfect, and you still use pointers enough to get into trouble, but it reduces their use to the point where you spend a lot less time groveling around for that leaking memory. God bless destructors!

So C isn't perfect. The hardest types of errors to track down are compiler errors, and trust me, they do exist.
How do you get out of the aforementioned errors? Listen up:
Syntax:
  • = vs. ==: Common simple mistake. Hardly even worth making more than once, but it happens all the time, especially to people who type quickly. How do you get around it? Get in your head to think like this:
       if(0 == myVar)

    Get people to stick the number on the left. What happens when you put a single equals sign? "Invalid L-value", right there, in your face, in the compiler, problem squashed.
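
A small sketch of the constant-on-the-left habit, for illustration only. The function name is_zero is made up; the exact compiler message for the broken form varies from compiler to compiler.

```c
/* Hypothetical helper: returns 1 when myVar is zero, 0 otherwise. */
int is_zero(int myVar)
{
    /* The typo `if (myVar = 0)` compiles, stores 0 in myVar, and is
       always false. The same typo with the constant first,
       `if (0 = myVar)`, is rejected because 0 cannot be assigned to. */
    if (0 == myVar)
        return 1;
    return 0;
}
```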


Binary:
  • Memory leaks: Perform memory sub-allocation, or do your own memory management. Think of it like matter in chemical reactions: memory can neither be created nor destroyed. You shouldn't have more than zero bytes outstanding when you shut down, and you shouldn't try to free something that shouldn't be freed. Overload new (if you are a C++ junkie), or make a debugging malloc macro that records how much memory you are allocating and adds it to a global variable. If you end up with anything (> 0) left over, track it down and free it, or throw an assertion. Using a memory sub-allocation library for all of your allocation (say, in 4k chunks for performance on a paged system) would serve you well, but it is a little over the top for some applications.
  • Crashes: GDB, VS Debugger, WinDbg, these are all your friends. Set breakpoints, trace, watch your locals.
  • Buffer overflows: It's definitely hard to spot a place where buffer overflows can happen. A common way to check uninitialized memory is to fill it with a known pattern first, something that shows up really well in a debugger. We used to try to come up with new and interesting ones, like DEADBEEF or BADF000D. All in all, you need to use some kind of protection on your input, and be dead careful. This is something that plagues the industry, because buffer overflows happen (it's a weakness in C, since it is compiled code and has no language support for flexible buffering).
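
The fill-pattern trick from the buffer overflow item can be sketched roughly like this. It is a debugging aid under stated assumptions, not a real protection scheme: guarded_alloc and guard_intact are hypothetical names, the 0xCD fill byte is an arbitrary choice, and the caller has to remember each block's size.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 4
static const unsigned char guard_bytes[GUARD_LEN] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Hypothetical debug allocator: fills the block with 0xCD so reads of
   uninitialized memory stand out, and puts DEADBEEF just past the end. */
void *guarded_alloc(size_t n)
{
    unsigned char *p = malloc(n + GUARD_LEN);
    if (p == NULL)
        return NULL;
    memset(p, 0xCD, n);
    memcpy(p + n, guard_bytes, GUARD_LEN);
    return p;
}

/* Returns 1 if the guard after an n-byte block is intact, 0 if something
   wrote past the end of the block. */
int guard_intact(const void *block, size_t n)
{
    return memcmp((const unsigned char *)block + n, guard_bytes, GUARD_LEN) == 0;
}
```

Check guard_intact before freeing a block, or at any point where you suspect an overrun. It only catches writes that land on the guard bytes themselves, so a write that skips over them goes unnoticed.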


The worst errors you can run into are the compiler or API errors. Compiler bugs are usually memory or optimization related, and tend to show up when you are writing a massively complicated program. Why are they terrible? Because it's hard to prove it's the compiler. It's your code vs. theirs. API bugs are bad, because the point of most APIs is to keep the user away from the lower levels of the program. (Bummer, it died in WinExec? What the heck is that?!?)

C is as good as any other compiled language. It has its ups and downs. When working at Microsoft, a friend of mine always complained that he had to work with beta software, on a beta system, with a beta compiler. In those situations, every little line can send your program burning down in flames. Systems reach an entropy point, and sometimes I felt that we always teetered on the edge of that. If only "oh I forgot an equals sign", or "oh darn, off by one, forgot about the null terminator" was the cause of most of these problems.

There is, without a doubt, nothing as infuriating as memory management problems in C or C++, certainly enough to make me long for Java or Scheme or some other such language with garbage collection facilities. At least in C++, the potential exists for reference counting (see the C++ FAQ for info), so that memory management may be made to take care of itself.
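
For comparison, here is roughly what reference counting looks like when done by hand in C. This is a sketch of the general idea, not the C++ FAQ's approach: rc_string, rc_new, rc_retain and rc_release are made-up names, the count is not thread-safe, and objects that point at each other in a cycle will never reach zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; the count lives next to the data. */
struct rc_string {
    size_t refs;
    char   text[];   /* C99 flexible array member */
};

/* Create a new object holding a copy of s, with one reference. */
struct rc_string *rc_new(const char *s)
{
    size_t n = strlen(s) + 1;                  /* include the terminator */
    struct rc_string *r = malloc(sizeof *r + n);
    if (r == NULL)
        return NULL;
    r->refs = 1;
    memcpy(r->text, s, n);
    return r;
}

/* Every additional owner takes a reference... */
struct rc_string *rc_retain(struct rc_string *r)
{
    r->refs++;
    return r;
}

/* ...and gives it back when done. Returns 1 if this call freed the object. */
int rc_release(struct rc_string *r)
{
    if (--r->refs > 0)
        return 0;
    free(r);
    return 1;
}
```

The discipline is that whoever stores the pointer calls rc_retain, and whoever drops it calls rc_release; nobody calls free directly. In C nothing enforces that, which is where C++ destructors earn their keep.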

One major problem I had (and still do occasionally) concerned the terminating null on strings, which I would routinely forget to allocate or properly reallocate, with disastrous results. ("Dammit, why is free() segfaulting?! There must be a bug in glibc!")
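
The usual shape of the fix, as a minimal sketch: strlen does not count the terminator, so the allocation needs one more byte than it reports. The helper name copy_string is made up for the example. Writing that extra byte past a too-small block often damages the allocator's own bookkeeping, which is one plausible reason the crash shows up later inside free() rather than at the bad write.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if out of memory. */
char *copy_string(const char *s)
{
    size_t n = strlen(s);      /* length WITHOUT the terminating '\0' */
    char *p = malloc(n + 1);   /* so allocate one extra byte for it */
    if (p == NULL)
        return NULL;
    memcpy(p, s, n + 1);       /* copies the terminator as well */
    return p;
}
```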

A note about = vs ==: a good compiler should be able to catch this one. For instance, CodeWarrior (one of the two major MacOS compilers) will warn you if you do something like if( a = 5 ) { do_stuff(); }. However, it's fine with if( (a = 5) ) { do_stuff(); }, so you won't have to wade through warning messages when you do want to assign a value.

Similarly, it will pick up a == 5; without an if, but I don't think too many people make that error.

As far as the most elusive type of problem goes, I have to concur with the general conclusion here: memory leaks. An unassigned pointer or buffer overflow crashes sooner or later (probably sooner), and that tends to give you a good idea of what is causing it and where it is. Hunting it down tends to be fairly direct, if time consuming, because you know where to look. They can be nasty as hell, but they aren't usually very elusive.

Memory leaks, on the other hand, don't give you any warning. Unless you specifically look for them by keeping track of allocated and freed memory, you won't even know they exist. Once you do determine that you're leaking memory, you really have no idea where to start looking. They aren't impossible to fix or particularly nasty, but they are elusive.
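
Keeping track of allocated and freed memory can be as crude as the sketch below: wrap malloc and free, count live blocks, and log the call site of each. All names here (dbg_malloc, dbg_free, DBG_MALLOC, DBG_FREE, dbg_live_blocks) are invented for the example, and the counter is not thread-safe.

```c
#include <stdio.h>
#include <stdlib.h>

static long live_blocks = 0;   /* allocations that have not been freed yet */

/* Hypothetical wrapper: allocate, count the block, log where it came from. */
void *dbg_malloc(size_t n, const char *file, int line)
{
    void *p = malloc(n);
    if (p != NULL) {
        live_blocks++;
        fprintf(stderr, "alloc %p (%lu bytes) at %s:%d\n",
                p, (unsigned long)n, file, line);
    }
    return p;
}

/* Hypothetical wrapper: uncount the block, log where it was freed. */
void dbg_free(void *p, const char *file, int line)
{
    if (p != NULL) {
        live_blocks--;
        fprintf(stderr, "free  %p at %s:%d\n", p, file, line);
    }
    free(p);
}

/* Route calls through the wrappers so each one records its call site. */
#define DBG_MALLOC(n) dbg_malloc((n), __FILE__, __LINE__)
#define DBG_FREE(p)   dbg_free((p), __FILE__, __LINE__)

/* Number of blocks still outstanding; should be 0 at shutdown. */
long dbg_live_blocks(void)
{
    return live_blocks;
}
```

If dbg_live_blocks() is not zero when the program shuts down, something leaked, and pairing up the alloc and free lines in the log gives you a file and line to start from instead of nothing.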
