The ISO C construct for nonlocal gotos is a library facility defined in the header file <setjmp.h>. The macro setjmp sets a label (stored in an object of type jmp_buf), and the function longjmp transfers control flow to that location. setjmp/longjmp existed well before ISO C or ANSI C. Although the standard is careful to avoid pointing at any specific implementation, the definition is tuned towards the very common stack-based implementation strategy. The facility is also available in C++, but its use there is highly inadvisable.

goto...

C's goto construct is a part of the language. There are limitations on where you can goto: You must stay within the same function (labels have function scope), and since C99 you cannot jump into the scope of a variable-length array. Jumping into a subblock is otherwise permitted in C, although rarely wise. C++ adds even more limitations: you cannot goto past the initialization of a variable, such as one that runs a constructor.

The reasons for these limitations should be clear. If you could goto another function, what would returning from that function do? What would be the contents of its automatic variables? Jumping into a subblock raises a milder version of the same problem: any initializers you skip are never run, so automatic variables defined in the subblock have indeterminate contents, and any looping or conditional construct you land in starts in a state it never set up.

As a part of the language, goto's semantics are precisely specified, and can be checked statically at compile time.
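
As a small illustration of a goto the compiler can check completely, here is a sketch that jumps out of two nested loops to a label in the enclosing block of the same function. The function name find_row and the fixed row width of 3 are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the first row that
 * contains target, or -1 if no row does. */
int find_row(const int grid[][3], size_t rows, int target)
{
    int found = -1;

    assert(grid != NULL || rows == 0);

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < 3; c++) {
            if (grid[r][c] == target) {
                found = (int)r;
                goto done;  /* outward jump, same function: always legal */
            }
        }
    }

done:
    return found;
}
```

The label and the goto live in the same function, so the compiler can verify the jump target exists. Moving the label into a different function would be rejected at compile time.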

By living in the standard library, the setjmp/longjmp combo gets special treatment. Its use cannot be checked for correctness; instead, the standard puts the onus on the programmer to make sure the code is correct.

Nonlocality

setjmp/longjmp has very few syntactic restrictions: setjmp is a macro that yields an int, and longjmp is an ordinary function -- that's about all the compiler can check. You can setjmp a label in one block in one function, and longjmp to it from any function called by that block. Note that setting the label by setjmp occurs at run time (not at compile time), so it makes sense to talk about run-time considerations such as the call stack! Of course, the standard is careful to say nothing about the implementation using a stack; a non-stack-based implementation could be conforming.

Calling setjmp is a bit like calling fork(): it returns to your code more than once.

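  • You call setjmp, passing it its argument (of the opaque array type jmp_buf), and the first time it returns 0. About the only thing the standard lets you do portably with this return value is compare it against an integer constant or use it directly as the controlling expression of a flow control construct (that really means just "if", "switch", "while", and the like...). Assigning it to a variable works on most implementations but is not among the contexts the standard guarantees.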

    Your program can recognize this return code and proceed with what it meant to do. Presumably, you pass the jmp_buf argument "inwards" to functions lower down the call stack.

  • As stated, you may call longjmp with the jmp_buf from anywhere, as long as the function containing the setjmp hasn't returned. Note that this is a run-time limitation, and as such cannot be checked by the compiler!

    You pass longjmp the jmp_buf, containing the stored "label" (actually the saved execution context needed to "rewind" back to the setjmp), as well as an integer return code. Now setjmp returns again, at the same spot as before, but with the new return code. Because 0 marks the first return from setjmp, you cannot pass a return code of 0 to longjmp; if you try, it will be translated to 1.

    Your program actually resumes executing from the setjmp, only it now returns a new value! The jump does not restore local variables, though: any non-volatile automatic variable in the function containing the setjmp that was modified between the setjmp and the longjmp has an indeterminate value afterwards, so it is not portable to rely on it. You identify this return by the nonzero value, and can then proceed to do something completely different.
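
The two returns described above can be sketched as follows. The helper names bail_out and classify_return are invented for this example. The setjmp call is used directly as the controlling expression of a switch, and the one local that is modified between the setjmp and the longjmp is declared volatile.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical helper lower down the call stack: jumps back to
 * whichever setjmp filled in env. It never returns to its caller. */
static void bail_out(jmp_buf env, int code)
{
    longjmp(env, code);
}

/* Sets a label, jumps back to it with the given code, and reports
 * which value setjmp produced on its second return. Only the values
 * 1 and 7 are recognized in this sketch; anything else gives -1. */
int classify_return(int code)
{
    jmp_buf env;
    volatile int passes = 0;  /* volatile: changed between setjmp and longjmp */

    switch (setjmp(env)) {
    case 0:                   /* first return: the label was just set */
        passes = 1;
        bail_out(env, code);
        return -1;            /* not reached */
    case 1:                   /* second return, code 1 (or a translated 0) */
        assert(passes == 1);
        return 1;
    case 7:                   /* second return, code 7 */
        assert(passes == 1);
        return 7;
    default:
        return -1;
    }
}
```

Calling classify_return(0) yields 1, because longjmp never lets setjmp return 0 a second time.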

The typical use of setjmp/longjmp is around a deeply nested main loop, particularly one that lives inside a library over which you have no control: yacc (or bison or friends), or any GUI's event loop, which calls an event handler you wrote, which wants to return an error code. When an error is discovered inside a routine called from the main loop, it can be very hard to report the error or to recover from it. Every routine would need to return an error indication to its caller, to prevent further processing. Doing this can be very hard or even impossible, depending on the structure of the library's main loop. A C++ method could be to throw an exception; the C method is to set some global variable to the appropriate error context and longjmp out of the loop. The resulting return value from setjmp signals your code (outside the main loop) that an error indication should be processed. Typically some indication of the error is emitted to the user (or caller), and the main loop is re-initialized and called again -- to continue processing.
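
A minimal sketch of that pattern, under some loud assumptions: library_main_loop is a stand-in for a real library loop that ignores whatever its callback returns, and the names recover_point, error_context, on_event, run_events and last_error are all made up here. A real program would usually re-initialize the loop and call it again after reporting the error; this sketch simply stops.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf recover_point;      /* the label, set outside the main loop */
static const char *error_context;  /* global describing what went wrong */
static int handled;                /* events processed so far */

/* Stand-in for a library loop we cannot change: it calls the handler
 * for each event and offers no way to report failure. */
static void library_main_loop(const int *events, size_t n,
                              void (*handler)(int))
{
    for (size_t i = 0; i < n; i++)
        handler(events[i]);
}

/* Our event handler: on a bad event, record why and jump out. */
static void on_event(int ev)
{
    if (ev < 0) {
        error_context = "negative event";
        longjmp(recover_point, 1);
    }
    handled++;
}

/* Runs the loop and returns how many events were handled before the
 * first error (or all of them if none failed). */
int run_events(const int *events, size_t n)
{
    assert(events != NULL || n == 0);
    handled = 0;
    error_context = NULL;

    if (setjmp(recover_point) == 0)
        library_main_loop(events, n, on_event);
    /* else: we arrived here via longjmp; error_context says why */

    return handled;
}

/* Returns the recorded error context, or NULL if the last run was clean. */
const char *last_error(void)
{
    return error_context;
}
```

The counter and the error context are file-scope objects rather than automatic variables, so their values are well defined after the jump.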

Leaking Resources

A major problem with longjmp is the ease of leaks. By the time setjmp returns for the second time, the main loop may well have allocated memory, opened files, and generally used some resources. These will typically have to be freed -- or at least not "lost" by the application. Often a great deal of design is needed merely to allow a safe longjmp! (E.g. the Apache (or APR) architecture uses "pools" for tracking all allocations, so freeing all resources used is greatly simplified.) Note that memory allocated using alloca() is, of course, automatically freed, since the stack is "rewound" past it. However there are many other problems with alloca()...
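
To make the idea of tracking allocations concrete, here is a toy pool. It is an assumption-laden sketch and not the APR API: pool_alloc, pool_clear, pool_recover and run_with_pool are names invented for this example, and the pool is a single global list.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical mini-pool: every allocation is recorded in a global
 * list so it can be released even if the pointer itself is lost. */
struct pool_node {
    struct pool_node *next;
    void *mem;
};

static struct pool_node *pool_head;
static int pool_live;              /* allocations currently tracked */
static jmp_buf pool_recover;

static void *pool_alloc(size_t size)
{
    struct pool_node *node;

    assert(size > 0);
    node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->mem = malloc(size);
    if (node->mem == NULL) {
        free(node);
        return NULL;
    }
    node->next = pool_head;
    pool_head = node;
    pool_live++;
    return node->mem;
}

/* Frees everything the pool still tracks. */
static void pool_clear(void)
{
    while (pool_head != NULL) {
        struct pool_node *next = pool_head->next;
        free(pool_head->mem);
        free(pool_head);
        pool_head = next;
        pool_live--;
    }
}

/* Allocates, then fails: the local pointer vanishes with the jump. */
static void work_that_fails(void)
{
    char *scratch = pool_alloc(64);
    (void)scratch;
    longjmp(pool_recover, 1);
}

/* Returns the number of allocations still tracked after recovery. */
int run_with_pool(void)
{
    if (setjmp(pool_recover) == 0)
        work_that_fails();
    else
        pool_clear();              /* release what the jump orphaned */
    return pool_live;
}
```

Had work_that_fails called malloc directly, the 64 bytes would be unreachable after the longjmp. With the pool, the recovery branch can free them without knowing who allocated what.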

When you longjmp through C++ code, no destructors will be called (formally, the behavior is undefined if the jump skips objects with non-trivial destructors). This is very different from throwing an exception: longjmp takes constant time, while throw walks through all the intervening stack frames. On the other hand, this property may be desirable. It's a completely different mechanism from exception handling.
