This is what you look at after a router has crashed owing to a hardware or, more often, a software error. It can only be of help if you know what the hell it is that you're looking at. The stack trace is a list of the active subroutine calls in the current process; it includes the PC (program counter) address where the problem originated as well as the PCs in the routines that called it. The symbol table associated with the router's image (IOS version) can translate the PC values in the stack into routine names and offsets, which can then be used to locate lines of source code. The solution is usually the installation of a newer IOS and/or more memory. NOTE: For release 11.3(1) through 11.3(2), the stack trace fails to report the correct version of code as of the last reboot if the Flash doesn't match the current running version of code.
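
To make the symbol-table step concrete, here is a minimal sketch of turning raw PC values into "routine plus offset". Everything in it is an assumption for illustration: the addresses, the routine names, and the table layout are hypothetical and have nothing to do with any real IOS image.

```python
import bisect

# Hypothetical symbol table: (start address, routine name), sorted by address.
SYMBOLS = [
    (0x60001000, "process_input"),
    (0x60001400, "parse_packet"),
    (0x60001A00, "checksum"),
]
STARTS = [addr for addr, _ in SYMBOLS]

def symbolize(pc):
    """Return 'routine+0xOFFSET' for the routine starting at or just below pc."""
    i = bisect.bisect_right(STARTS, pc) - 1
    if i < 0:
        return hex(pc)  # below the first known symbol: leave the address raw
    start, name = SYMBOLS[i]
    return f"{name}+{pc - start:#x}"

# Hypothetical PCs as they might appear in a trace, innermost first.
stack_pcs = [0x60001A2C, 0x60001450, 0x60001010]
resolved = [symbolize(pc) for pc in stack_pcs]
print(resolved)
```

The lookup is a binary search for the last routine that starts at or before the PC; the offset is the distance from that start.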

I'll use Everything 2 for an example here for the sake of familiarity, but this applies to virtually any modern software.

When we do structured programming, functional programming, or object-oriented programming -- indeed any paradigm that's really useful -- we write "subroutines" (a.k.a. "functions" in many languages). A subroutine is a blob of code somewhere which performs a given task. It could be a simple task or a complex one; in fact, it can break its task down into simpler tasks and call other subroutines to perform those. That's why subroutines are neat: Maybe "getNodeTitle" is the subroutine in E2 that looks up a node in the database and gets the node's title -- it doesn't matter who needs to know that information, and it doesn't matter why: You tell getNodeTitle which node you're interested in, and getNodeTitle digs up the information for that node and hands it over. Anybody who wants a node title just goes and asks getNodeTitle, so they don't have to worry about all the details of how getNodeTitle does its thing.
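
Here is a rough sketch of that idea in Python. It is not E2 code: the dictionary standing in for the database and the two callers are hypothetical, made up only to show two different parts of a program asking the same subroutine the same question.

```python
# Hypothetical stand-in for the database: node id -> node record.
NODES = {
    101: {"title": "stack trace", "type": "e2node"},
    102: {"title": "subroutine", "type": "e2node"},
}

def get_node_title(node_id):
    """Look up a node and hand back its title; callers never see how."""
    node = NODES[node_id]
    return node["title"]

def show_heading(node_id):
    # One caller: wants the title for a page heading.
    return "== " + get_node_title(node_id) + " =="

def show_link(node_id):
    # Another caller with a different purpose, asking the same question.
    return "[" + get_node_title(node_id) + "]"

print(show_heading(101))
print(show_link(102))
```

Neither caller knows or cares that the titles live in a dictionary; if the lookup changed, only get_node_title would have to change.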

So: Subroutines call other subroutines, which in turn call yet other subroutines, "and so ad infinitum."

You could draw the execution process as something like a tree:

    Start with subroutine A;
        Go to subroutine B;
            Go to subroutine C;
            Back up to subroutine B;
        Back up to subroutine A;        
    Go to subroutine D;
        Go to subroutine E;
            Go to subroutine F;
            Back up to subroutine E;
        Back up to subroutine D;
        Go to subroutine G;
        Back up to subroutine D;
    Back up to subroutine A.

...and so on.
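
If you'd like to watch a tree like that draw itself, the following sketch records a line whenever a subroutine is entered or left, indented by how deep we are. The decorator and the log are my own scaffolding, assumed purely for illustration; the subroutines a through g mirror the A through G above.

```python
log = []    # lines of the call tree, in execution order
depth = 0   # how many subroutines are currently active

def traced(func):
    """Wrap func so that entering and leaving it is written to the log."""
    def wrapper():
        global depth
        log.append("    " * depth + "enter " + func.__name__)
        depth += 1
        func()
        depth -= 1
        log.append("    " * depth + "leave " + func.__name__)
    return wrapper

@traced
def c(): pass

@traced
def b(): c()

@traced
def f(): pass

@traced
def e(): f()

@traced
def g(): pass

@traced
def d():
    e()
    g()

@traced
def a():
    b()
    d()

a()
print("\n".join(log))
```

The indentation at any line is simply the number of subroutines that have been entered and not yet left.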

At any point during all this running around, you can trace a "straight line" back to your starting point: You were in A, then you went to D, then into E, now you're in F. In simple terms, leaving out a few details, that's the "stack". A "stack trace" at that point would look like this:

    sub F
    sub E
    sub D
    sub A

After you're done with F and you back up into E, it would look like this:

    sub E
    sub D
    sub A
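
Python will show you the same thing if you ask. The sketch below uses the standard traceback module to capture the names of the active subroutines, once while inside F and once after backing up into E. The helper current_stack and the sub_* functions are hypothetical names chosen to match the listings above.

```python
import traceback

captured = {}

def current_stack():
    """Names of the active subroutines, innermost first."""
    frames = traceback.extract_stack()[:-1]  # drop current_stack itself
    return [frame.name for frame in reversed(frames)]

def sub_f():
    captured["in F"] = current_stack()

def sub_e():
    sub_f()
    # F has returned by now, so it is no longer on the stack.
    captured["back in E"] = current_stack()

def sub_d():
    sub_e()

def sub_a():
    sub_d()

sub_a()
print(captured["in F"][:4])
print(captured["back in E"][:3])
```

The full lists go on past sub_a, because whatever started the program (the module itself, at least) is on the stack too; those are some of the details left out above.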

The reason why this is interesting is that in practice, you could jump into E from anywhere in your program. Maybe you got there from W, maybe you got there from R. You could have gotten to R from Z or from U. Once you're inside one of these things, you really don't know how you got there. Think back to getNodeTitle up above: getNodeTitle doesn't care who needs the node's title. That's not its job.

But what happens if something goes wrong? Suppose some code somewhere tells getNodeTitle to get a title for a node which doesn't exist. getNodeTitle fails, and the user sees a server error -- unless the user is a god, in which case the user sees a stack trace. This is important because the code which called on getNodeTitle is where the error is. As I've said, that code could be in any one of dozens or even hundreds of places on the system. If all you know is "something went wrong in getNodeTitle", you could be in for a very long night. To fix a problem, a programmer usually needs to know how the program got to where it was when it ran into trouble. The stack trace tells the programmer which code called getNodeTitle. Maybe it looks a bit like this (again, we're leaving out a lot of detail):

    getNodeTitle
    getNodeInformation
    showNewWriteups
    createNodelets
    createPage

Now the programmer can look in getNodeInformation and see what's going on there. Maybe there's nothing wrong; in that case, he knows that showNewWriteups is the next place to look, and so on back up the chain.
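
As a sketch of that scenario (again, not how the Everything Engine actually works), here is a Python version in which the innermost subroutine fails and the exception's traceback is used to recover the chain of callers. The function names are adapted from the listing above; the data and the function bodies are hypothetical.

```python
import traceback

NODES = {1: {"title": "stack trace"}}  # hypothetical data; node 2 does not exist

def get_node_title(node_id):
    return NODES[node_id]["title"]  # raises KeyError for a missing node

def get_node_information(node_id):
    return {"title": get_node_title(node_id)}

def show_new_writeups():
    writeups = []
    for node_id in (1, 2):  # the bad request comes from here, not from below
        writeups.append(get_node_information(node_id))
    return writeups

def create_nodelets():
    return show_new_writeups()

def create_page():
    return create_nodelets()

try:
    create_page()
except KeyError as err:
    frames = traceback.extract_tb(err.__traceback__)
    # The traceback runs outermost to innermost; reverse it to match the listing.
    call_chain = [frame.name for frame in reversed(frames)]
    print("\n".join(call_chain[:5]))
```

The error surfaces in get_node_title, but the chain points back toward show_new_writeups, which is where the request for a nonexistent node came from.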

Most modern programming environments have "debuggers" which will let you stop the execution of a program in midstream and get a sort of X-ray view of what it's doing at the moment. Any debugger worth its salt will let you see a stack trace. Here's a real-life example from Microsoft's C++ environment. It gives you the names of the subroutines (in C++ they're called functions), it tells you where they are in the source code ("line 223"), and it tells you what information has been "passed" to them by the other subroutines which called them ("unsigned int 15" -- an integer which can't be negative, and which happens at the moment to be equal to 15):

  CDigitWidget::OnPaint() line 223
  CWnd::OnWndMsg(unsigned int 15, unsigned int 0, long 0, long * 0x0012f638) 
      line 1825
  CWnd::WindowProc(unsigned int 15, unsigned int 0, long 0) line 1585 + 30 bytes
  CDisplayWidget::WindowProc(unsigned int 15, unsigned int 0, long 0) line 237
  AfxCallWndProc(CWnd * 0x00b863e8 {CDigitWidget hWnd=0x00bb1238}, HWND__ * 
      0x00bb1238, unsigned int 15, unsigned int 0, long 0) line 215 + 26 bytes
  AfxWndProc(HWND__ * 0x00bb1238, unsigned int 15, unsigned int 0, long 0) 
      line 368
  USER32! 77e71303()
  USER32! 77e71962()
  NTDLL! 77f763ef()

The "USER32! 77e71303()" gibberish refers to subroutines which are part of the operating system. The debugger doesn't know their names because it doesn't have access to their source code. Microsoft's lawyers like it that way. When we released the program the example is from, we didn't let anybody see the source code either: Our lawyers like it that way, too.
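
You can get a poor man's version of that debugger view from inside a running Python program. The sketch below uses the standard inspect module to report, for each active function, its name, the line it is currently on, and the values that were passed to it. The function names only loosely echo the trace above and are hypothetical; this is not how the Microsoft debugger does it.

```python
import inspect

def describe_stack():
    """Return (function name, line number, passed arguments) per active frame."""
    rows = []
    for info in inspect.stack(0)[1:]:  # skip describe_stack itself
        arg_info = inspect.getargvalues(info.frame)
        passed = {name: arg_info.locals[name] for name in arg_info.args}
        rows.append((info.function, info.lineno, passed))
    return rows

snapshot = []

def on_paint():
    snapshot.extend(describe_stack())

def window_proc(message, w_param, l_param):
    on_paint()

def call_wnd_proc(message):
    window_proc(message, 0, 0)

call_wnd_proc(15)
for name, line, passed in snapshot[:3]:
    print(name, "line", line, passed)
```

Python keeps the names around at run time, so nothing here comes out as a bare address the way the USER32 entries do.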

There you have it: When something goes wrong, a stack trace is your friend. It doesn't exactly tell you what went wrong, but it does tell you where to start looking.



I ain't no computer scientist; I'm just a humble programmer, and would be glad to hear any corrections/suggestions from knowledgeable parties. As for the wildly schematic notion of how the Everything Engine works, it's just that: Schematic. This is not Everything documentation.
