void * malloc(size_t size) is part of libc, as dictated by the gods K&R. Its purpose, surprisingly enough, is to allocate memory.

It gets the memory space from the heap, sometimes referred to as the malloc arena. If there is not enough space in the heap, malloc() will make a system call asking the operating system to extend your program's memory space. On Linux, this system call is traditionally brk, though modern allocators such as glibc's also use mmap for large requests.

Note, however, that the system call to get more memory does not always succeed, namely when your system is out of virtual memory. In this case, malloc will have no memory to give you, and thus will return a null pointer (NULL). As such, defensive programming dictates that you check the return value of malloc, lest ye dereference a null pointer and thus segfault.
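Here's a minimal sketch of that check. The helper name copy_ints is made up for illustration, and handing the failure back to the caller as NULL is just one reasonable policy, not the only one.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of n ints, or NULL on failure.
   The caller owns the result and must free() it. */
int *copy_ints(const int *src, size_t n)
{
    /* Refuse empty requests and sizes whose byte count would overflow. */
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;

    int *dst = malloc(n * sizeof *dst);
    if (dst == NULL) {
        /* malloc had nothing to give us: say so and let the caller decide. */
        fprintf(stderr, "copy_ints: out of memory\n");
        return NULL;
    }

    memcpy(dst, src, n * sizeof *dst);
    return dst;
}
```

The point is simply that dst is never touched until it has been compared against NULL.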

When you're done with memory, as others have suggested, calling free() is highly advisable, unless you want a memory leak or are using some kind of garbage collection scheme. But don't free it twice: many malloc implementations rely on the programmer not doing this (which they shouldn't anyway), and as such doing it can corrupt the malloc arena, causing all manner of hell to break loose.
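One common way to make an accidental second free harmless is to clear the pointer as you free it. This is only a sketch: free_and_clear is a hypothetical helper, and it protects only the one variable you pass in. Any other copy of the pointer still dangles.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and sets it to NULL. Calling it again on
   the same variable ends up as free(NULL), which the C standard defines
   as a no-op, so the arena is left alone. */
void free_and_clear(char **pp)
{
    if (pp == NULL)
        return;
    free(*pp);
    *pp = NULL;
}
```

Usage is free_and_clear(&buf); instead of free(buf); for a char *buf obtained from malloc.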

While malloc() may seem like a simple task, the actual implementation is quite complex, and seeing as how most programs rely on malloc, it has been subject to quite a bit of research. Obviously, an E2 node is not the place to describe all of these issues in depth, but here are a few:
  • coalescing - when two adjacent pieces of memory are freed, malloc implementations can either leave them alone or coalesce them into one larger chunk. There are benefits to both ways. On one hand, by treating them as one piece of memory, malloc may be able to satisfy a larger allocation without needing to increase the memory segment size. This is a win. On the other hand, programs often repeatedly allocate the same sizes (e.g. the sizes of their data structures), so it is likely that malloc will soon need a piece of memory of exactly the size just freed; leaving the pieces alone skips the work of merging them and then splitting them again.
  • contiguity - when the program allocates a lot of things sequentially, it is good for them to be allocated next to each other. Why? Suppose they make up some complex data structure like a linked list or tree. It's a good bet that the program will be accessing them at the same time. If they're near each other, the program will have to access fewer pages to get at them, and thus if they get paged out to disk, fewer page faults will occur.
  • mutex contention - in a multithreaded program, several threads may be trying to use malloc/free at the same time. Obviously it has to be thread safe, but depending on how cleverly it's implemented, there can be finer-grained protection than simply allowing only one thread to allocate/deallocate at a time.
  • speed - duh. All these things take time, and sometimes it's better to have a faster, slightly less optimal malloc.
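To make the coalescing item concrete, here's a toy sketch. It models the arena as an array of block descriptors sorted by offset. That's an assumption made purely for illustration, since real allocators keep this bookkeeping inside the heap itself (free lists, boundary tags and so on). The struct and function names are made up.

```c
#include <stddef.h>

/* Toy descriptor for one piece of the arena (hypothetical layout). */
struct block {
    size_t offset;   /* where the piece starts within the arena */
    size_t size;     /* its length in bytes */
    int    is_free;  /* nonzero if it is not currently allocated */
};

/* Merges each run of adjacent free blocks into one larger free block,
   in place. Assumes blocks[] is sorted by offset. Returns the new
   number of blocks. */
size_t coalesce(struct block *blocks, size_t n)
{
    if (n == 0)
        return 0;

    size_t last = 0;  /* index of the last block kept so far */
    for (size_t i = 1; i < n; i++) {
        int touching =
            blocks[last].offset + blocks[last].size == blocks[i].offset;

        if (blocks[last].is_free && blocks[i].is_free && touching) {
            /* Two free neighbours: grow the earlier one to swallow this one. */
            blocks[last].size += blocks[i].size;
        } else {
            /* Otherwise keep the block as it is. */
            blocks[++last] = blocks[i];
        }
    }
    return last + 1;
}
```

With two free 16-byte neighbours at offsets 0 and 16, the result is one free 32-byte block, big enough for a request that neither piece could have served alone. An allocator that skips this step saves the merging work and keeps both 16-byte pieces ready for the next 16-byte request.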

malloc tips:
  • don't rely on rules like "every malloc should have a free" and "zero pointers when you free them" to save you from memory headaches. While these help in the simplest of cases, they fail in more complex ones.
  • when you're allocating many items that will be deallocated at the same time, aggregate malloc calls. In other words, if you know you're going to need 1000 structs that are 8 bytes long, instead of calling malloc(8) 1000 times, call malloc(8000) to get an array of them, then use them. Why? Malloc has lots of overhead, both in memory space and in CPU usage. If you're going to be done with those 1000 structs at the same time, malloc doesn't need to keep track of them individually. This can be a big speed win.
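A sketch of the aggregated version, for the record. The struct point type and the make_points name are hypothetical, and the struct is 8 bytes only on platforms where int is 4 bytes, which is why the code asks sizeof rather than hard-coding 8000.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical 8-byte struct (assuming 4-byte ints). */
struct point {
    int x;
    int y;
};

/* One malloc call for n structs instead of n separate calls.
   Returns NULL on failure; a single free() releases the whole array. */
struct point *make_points(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(struct point))
        return NULL;

    struct point *pts = malloc(n * sizeof *pts);
    if (pts == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++) {
        pts[i].x = (int)i;
        pts[i].y = 0;
    }
    return pts;
}
```

The catch is the one already mentioned: you can't hand back element 17 on its own. The array lives and dies as a unit.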

r00k123: malloc always returns contiguous memory, as does calloc. In fact, there's not really any such thing as a non-contiguous allocation; both return a void *, which is just a generic pointer to the start of the block. If there aren't 'size' bytes of usable memory starting at that pointer, bad things will happen. The only differences between malloc() and calloc() are that 1. malloc takes one argument while calloc takes two (an element count and an element size), and 2. calloc zeroes the memory before it returns.
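To spell out the malloc/calloc relationship, here's roughly what calloc gives you, built by hand from malloc and memset. The name zeroed_alloc is made up. The overflow check is my addition: modern calloc implementations generally refuse a count * size that doesn't fit in a size_t, but treat that as an assumption about your libc rather than gospel.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for calloc(count, size): one contiguous block of
   count * size bytes, all zero, or NULL on failure. */
void *zeroed_alloc(size_t count, size_t size)
{
    /* Refuse requests whose total byte count would overflow size_t. */
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;

    void *p = malloc(count * size);
    if (p != NULL)
        memset(p, 0, count * size);  /* the part plain malloc skips */
    return p;
}
```

In real code just call calloc; it can sometimes avoid the explicit zeroing when the operating system hands over pages that are already zero-filled.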