dis is an amusing little module in the standard library of the
Python programming language. The name is short for disassemble,
and the module does for Python bytecode what a machine code
disassembler does for native code (a kind of tool rarely used
anymore except by crackers).
Situations that call for disassembling Python bytecode are
truly rare, but doing so can provide some insight
into how Python works, deep down. The most obvious thing you notice
when you see Python bytecode is how high level it is. Contrast it with,
for example, Java bytecode, which is quite similar to the
instructions of a hardware CPU. Python's instruction set, while giving
away the fact that the Python virtual machine is stack based, consists
mostly of higher level operations that map directly to your
understanding of Python's mechanics, such as resolving names and
preparing exception handlers.
When I first learned of the module and disassembled some random
code, it was driven home to me
how dynamic Python is; I mean, I knew it was dynamic, but here was
proof in black and white.
One of the major problems
facing anyone who studies
disassembled CPU code is that many of the references to data, which
had such clever, entertaining, and hopefully helpful names in the source
code, are now reduced to meaningless numbers, such as offsets from
the stack pointer or other pointers and absolute memory addresses. The
dynamism of Python prevents that problem. Every name the programmer typed
in is still there: every module name, function name, class name, variable
name, object attribute name, you name it.
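To see those surviving names for yourself, you can look at the code object behind a function. The following is only a small sketch for a current Python 3 interpreter; the function greet and its attribute names are made up for illustration, and the exact order of the names in co_names may differ between versions.

```python
def greet(user):
    # 'user' and 'message' are local variables; 'name' and 'title' are
    # attribute names; 'print' is looked up as a global/builtin name.
    message = user.name.title()
    print(message)

code = greet.__code__            # the compiled code object of the function

local_names = code.co_varnames   # names of arguments and local variables
other_names = code.co_names      # global and attribute names used in the body

print(local_names)
print(other_names)
```

Nothing here is executed from the body of greet; the names are simply stored in the compiled code, which is why the disassembler can print them next to each opcode.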
Looking at the code involved in a common statement such as
    sys.stderr.write(string.join(errors))
makes you really glad that computers are fast, as you see the chains
of LOAD_ATTR opcodes, each showing the name
that will need to be looked up in a dictionary.
The only time I really needed dis, though, was when
I was writing a multi-threaded program and wanted to see if the
operation of adding two lists together is performed atomically.
(Well, I could have dug through the source code of Python, but this
was much easier.) I'll leave it to you to use dis
to find the answer to that question yourself, if you're
curious.
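If you want to try that experiment, one way to set it up on a current Python 3 interpreter is sketched below. The function add_lists is just an illustrative stand-in, dis.get_instructions is the modern iterator interface (it did not exist when the rest of this text was written), and the opcode names you see will depend on your interpreter version. Drawing the conclusion is still left to you.

```python
import dis

def add_lists(a, b):
    # The operation under study: concatenating two lists.
    return a + b

# Walk over the bytecode instructions of the function, one at a time.
opnames = []
for instr in dis.get_instructions(add_lists):
    opnames.append(instr.opname)
    print(instr.offset, instr.opname, instr.argrepr)
```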
The workhorse of the module is the disassemble
function (also known as disco). This function disassembles
a supplied code object and prints the result to sys.stdout. The convenience
function that is normally used, though, is dis.dis, which accepts
a code object, a function, a method, or a class. It essentially finds
all the code objects within the supplied object and calls
disassemble for each one.
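As a rough illustration of the difference between the two entry points, here is a sketch for a current Python 3 interpreter. The class Greeter is invented for the example, and the file= keyword used to capture the output is a later addition to the module; without it, both functions print to sys.stdout as described above.

```python
import dis
import io

class Greeter:
    def hello(self):
        return "hello"

    def bye(self):
        return "bye"

# dis.dis on a class: every method found in the class is disassembled in turn.
class_buffer = io.StringIO()
dis.dis(Greeter, file=class_buffer)
class_listing = class_buffer.getvalue()

# dis.disassemble works on exactly one code object.
code_buffer = io.StringIO()
dis.disassemble(Greeter.hello.__code__, file=code_buffer)
code_listing = code_buffer.getvalue()

print(class_listing)
print(code_listing)
```

The class listing contains one section per method, each introduced by a "Disassembly of ..." header, while the second listing covers only the single code object that was passed in.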
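To whet your appetite, here is the disassembly of the statement I
showed above, as it appears inside a small function that first sets
errors to an empty list (the BUILD_LIST and STORE_FAST opcodes):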
     0 SET_LINENO          1
     3 SET_LINENO          2
     6 BUILD_LIST          0
     9 STORE_FAST          0 (errors)
    12 SET_LINENO          3
    15 LOAD_GLOBAL         1 (sys)
    18 LOAD_ATTR           2 (stderr)
    21 LOAD_ATTR           3 (write)
    24 LOAD_GLOBAL         4 (string)
    27 LOAD_ATTR           5 (join)
    30 LOAD_FAST           0 (errors)
    33 CALL_FUNCTION       1
    36 CALL_FUNCTION       1
    39 POP_TOP
    40 LOAD_CONST          0 (None)
    43 RETURN_VALUE
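If you would like to produce a comparable listing yourself, the sketch below may help. It assumes a current Python 3 interpreter and guesses at the surrounding function from the opcodes in the listing (the name report is made up). Expect the output to differ in detail: SET_LINENO was dropped from later interpreters, which show source line numbers in a separate column instead, and several other opcodes have been renamed or reshuffled over the years. The names, however, should all still be there.

```python
import dis
import io

def report():
    # This function is only compiled, never called, so it does not matter
    # that sys and string are not imported here (and that string.join no
    # longer exists in Python 3).
    errors = []
    sys.stderr.write(string.join(errors))

buffer = io.StringIO()
dis.dis(report, file=buffer)   # capture the listing instead of printing it
listing = buffer.getvalue()

print(listing)
```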