Strings in C are funny things. They are really arrays of characters. In most low-level computer languages (such as C), an array is merely a region of memory where objects of the same type reside, side by side. Arrays give you quick random access but relatively expensive reordering of elements. (Linked lists, on the other hand, are data structures that trade off random access for quick reordering. Linked lists are fast to reorder, but relatively slow at reaching an arbitrary nth element.)

As such, the following string:

"everything2.com"

is, in C, in fact an array of characters that sit side by side in memory and spell out "everything2.com". This array is a memory region starting at the memory location where the first 'e' byte resides.

One interesting property of C-style strings is that they always have a known beginning but no recorded length. So if I have the declaration:

char *name = "My Name Is Calin";

in my program, how does a function like printf() know that it should start writing characters to standard output at the first 'M', keep going until it gets to the last 'n', and only then stop?

Well, the first part is easy. The 'M' has an address, which you pass to printf() when you call it. That address is stored in the variable I declared, called 'name'. The hard part is this: how does printf() know that it needs to stop right after the last character is printed, not before and not after? Basically, how does it know the size of my array?

The answer, my friends, is blowing in the wind. No, seriously, the answer lies in the NULL terminator byte (someone pointed out that this should be called NUL so as to avoid confusion with NULL or ((void *)0), and I insist that they both evaluate to zero, though strictly speaking NULL is a pointer and NUL is a character, so the compiler may complain if you swap them... :D ). Every C-style string has a NULL terminator byte appended to it (a character whose ASCII value is zero). This is often written as 0 in integer form and '\0' in character form.

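This is an imperfect solution, but a very fast one. The imperfection shows when one forgets to put the trailing '\0' at the end of a string. If you pass printf() a string that has no null byte, it keeps reading past the end of the array, which is undefined behavior: it may print garbage, crash, or spill whatever happens to sit next in memory.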

Alternatives, such as storing the length alongside the characters, would probably have been more robust, but would have traded away speed and simplicity to optimize for the not-so-common case.

Complaints about this can go to Dennis Ritchie or maybe even Ken Thompson or Brian Kernighan. :)
