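33.12. dis — Disassembler for Python bytecode

Source code: Lib/dis.py


The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as an input is defined in the file Include/opcode.h and used by the compiler and the interpreter.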

CPython implementation detail: Bytecode is an implementation detail of the CPython interpreter. No guarantees are made that bytecode will not be added, removed, or changed between versions of Python. Use of this module should not be considered to work across Python VMs or Python releases.

Changed in version 3.6: Use 2 bytes for each instruction. Previously the number of bytes varied by instruction.

Example: Given the function myfunc():

def myfunc(alist):
    return len(alist)

the following command can be used to display the disassembly of myfunc():

>>> dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              2 LOAD_FAST                0 (alist)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE

(The "2" is a line number).

33.12.1. Bytecode analysis

New in version 3.4.

The bytecode analysis API allows pieces of Python code to be wrapped in a Bytecode object that provides easy access to details of the compiled code.

class dis.Bytecode(x, *, first_line=None, current_offset=None)

Analyse the bytecode corresponding to a function, generator, asynchronous generator, coroutine, method, string of source code, or a code object (as returned by compile()).

This is a convenience wrapper around many of the functions listed below, most notably get_instructions(), as iterating over a Bytecode instance yields the bytecode operations as Instruction instances.

If first_line is not None, it indicates the line number that should be reported for the first source line in the disassembled code. Otherwise, the source line information (if any) is taken directly from the disassembled code object.

If current_offset is not None, it refers to an instruction offset in the disassembled code. Setting this means dis() will display a "current instruction" marker against the specified opcode.

classmethod from_traceback(tb)

Construct a Bytecode instance from the given traceback, setting current_offset to the instruction responsible for the exception.
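As a minimal sketch of from_traceback(), the snippet below provokes an exception in a hypothetical helper divide() (not part of the dis module) and builds a Bytecode instance from the resulting traceback. The exact listing depends on the Python version, but the instruction that raised should carry the "current instruction" marker.

```python
import dis


def divide(a, b):
    # Hypothetical helper used only to provoke an exception.
    return a / b


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    # from_traceback() follows the traceback to its innermost frame,
    # which here belongs to divide().
    failed = dis.Bytecode.from_traceback(exc.__traceback__)

listing = failed.dis()
print(listing)
```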

codeobj

The compiled code object.

first_line

The first source line of the code object (if available).

dis()

Return a formatted view of the bytecode operations (the same as printed by dis.dis(), but returned as a multi-line string).

info()

Return a formatted multi-line string with detailed information about the code object, like code_info().

Changed in version 3.7: This can now handle coroutine and asynchronous generator objects.

Example:

>>> bytecode = dis.Bytecode(myfunc)
>>> for instr in bytecode:
...     print(instr.opname)
...
LOAD_GLOBAL
LOAD_FAST
CALL_FUNCTION
RETURN_VALUE
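Besides iteration, a Bytecode instance exposes the attributes and methods described above. The following sketch redefines myfunc() so that it is self-contained, and shows that dis() and info() return strings instead of printing them.

```python
import dis


def myfunc(alist):
    return len(alist)


bytecode = dis.Bytecode(myfunc)

# dis() and info() return strings rather than printing them.
listing = bytecode.dis()
summary = bytecode.info()

print(bytecode.first_line == myfunc.__code__.co_firstlineno)
print(listing)
print(summary)
```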

33.12.2. Analysis functions

The dis module also defines the following analysis functions that convert the input directly to the desired output. They can be useful if only a single operation is being performed, so the intermediate analysis object isn’t useful:

dis.code_info(x)

Return a formatted multi-line string with detailed code object information for the supplied function, generator, asynchronous generator, coroutine, method, source code string or code object.

Note that the exact contents of code info strings are highly implementation dependent and they may change arbitrarily across Python VMs or Python releases.

New in version 3.2.

Changed in version 3.7: This can now handle coroutine and asynchronous generator objects.

dis.show_code(x, *, file=None)

Print detailed code object information for the supplied function, method, source code string or code object to file (or sys.stdout if file is not specified).

This is a convenient shorthand for print(code_info(x), file=file), intended for interactive exploration at the interpreter prompt.

New in version 3.2.

Changed in version 3.4: Added file parameter.
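The relationship between code_info() and show_code() can be sketched as follows. The io.StringIO buffer stands in for any writable text file; the content of the info string itself is implementation dependent.

```python
import dis
import io


def myfunc(alist):
    return len(alist)


# code_info() returns the description as a string.
info = dis.code_info(myfunc)

# show_code() prints the same description to the given file.
buffer = io.StringIO()
dis.show_code(myfunc, file=buffer)

print(buffer.getvalue() == info + "\n")
```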

dis.dis(x=None, *, file=None, depth=None)

Disassemble the x object. x can denote either a module, a class, a method, a function, a generator, an asynchronous generator, a coroutine, a code object, a string of source code or a byte sequence of raw bytecode. For a module, it disassembles all functions. For a class, it disassembles all methods (including class and static methods). For a code object or sequence of raw bytecode, it prints one line per bytecode instruction. It also recursively disassembles nested code objects (the code of comprehensions, generator expressions and nested functions, and the code used for building nested classes). Strings are first compiled to code objects with the compile() built-in function before being disassembled. If no object is provided, this function disassembles the last traceback.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

The maximal depth of recursion is limited by depth unless it is None. depth=0 means no recursion.

Changed in version 3.4: Added file parameter.

Changed in version 3.7: Implemented recursive disassembling and added depth parameter.

Changed in version 3.7: This can now handle coroutine and asynchronous generator objects.
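A short sketch of the file and depth parameters, assuming a Python version with recursive disassembly (3.7 or later). The helper disassemble_to_string() is hypothetical and only captures the output in a string.

```python
import dis
import io


def outer():
    def inner():
        return 1
    return inner


def disassemble_to_string(obj, depth=None):
    # Hypothetical helper: capture what dis.dis() would print.
    buffer = io.StringIO()
    dis.dis(obj, file=buffer, depth=depth)
    return buffer.getvalue()


full = disassemble_to_string(outer)              # includes inner()
shallow = disassemble_to_string(outer, depth=0)  # outer() only

print(len(full) > len(shallow))
```

With depth=0 the nested code object of inner() is left out, so the second listing is shorter.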

dis.distb(tb=None, *, file=None)

Disassemble the top-of-stack function of a traceback, using the last traceback if none was passed. The instruction causing the exception is indicated.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

Changed in version 3.4: Added file parameter.

dis.disassemble(code, lasti=-1, *, file=None)
dis.disco(code, lasti=-1, *, file=None)

Disassemble a code object, indicating the last instruction if lasti was provided. The output is divided into the following columns:

  1. the line number, for the first instruction of each line,
  2. the current instruction, indicated as -->,
  3. a labelled instruction, indicated with >>,
  4. the address of the instruction,
  5. the operation code name,
  6. operation parameters, and
  7. interpretation of the parameters in parentheses.

The parameter interpretation recognizes local and global variable names, constant values, branch targets, and compare operators.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

3.4 版更變: Added file parameter.
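The lasti parameter can be illustrated with a small sketch. Here the offset of the final instruction is chosen arbitrarily as the one to highlight; the opcode names in the listing vary between Python versions.

```python
import dis
import io


def myfunc(alist):
    return len(alist)


code = myfunc.__code__

# Pick the offset of the final instruction as the one to highlight.
last = list(dis.get_instructions(code))[-1]

buffer = io.StringIO()
dis.disassemble(code, lasti=last.offset, file=buffer)

# Only the line for that instruction carries the --> marker.
marked = [line for line in buffer.getvalue().splitlines() if "-->" in line]
print(marked)
```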

dis.get_instructions(x, *, first_line=None)

Return an iterator over the instructions in the supplied function, method, source code string or code object.

The iterator generates a series of Instruction named tuples giving the details of each operation in the supplied code.

If first_line is not None, it indicates the line number that should be reported for the first source line in the disassembled code. Otherwise, the source line information (if any) is taken directly from the disassembled code object.

New in version 3.4.
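As an illustration, the iterator returned by get_instructions() can be turned into a simple table. The column layout below is an arbitrary choice, not the format used by dis.dis().

```python
import dis


def myfunc(alist):
    return len(alist)


rows = [(ins.offset, ins.opname, ins.argrepr)
        for ins in dis.get_instructions(myfunc)]

for offset, opname, argrepr in rows:
    print(f"{offset:>4} {opname:<20} {argrepr}")
```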

dis.findlinestarts(code)

This generator function uses the co_firstlineno and co_lnotab attributes of the code object code to find the offsets which are starts of lines in the source code. They are generated as (offset, lineno) pairs. See Objects/lnotab_notes.txt for the co_lnotab format and how to decode it.

Changed in version 3.6: Line numbers can be decreasing. Before, they were always increasing.
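A minimal sketch of findlinestarts() on a three-line module compiled from a string. The file name "<example>" is a placeholder, and some Python versions may report extra entries that do not correspond to a source line.

```python
import dis

source = "a = 1\nb = 2\nc = a + b\n"
code = compile(source, "<example>", "exec")

# Each pair gives a bytecode offset and the source line starting there.
for offset, lineno in dis.findlinestarts(code):
    print(offset, lineno)

lines = {lineno for _, lineno in dis.findlinestarts(code)}
```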

dis.findlabels(code)

Detect all offsets in the raw compiled bytecode string code (for example the co_code attribute of a code object) which are jump targets, and return a list of these offsets.
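The sketch below passes the raw bytecode (co_code) of a hypothetical function sign() to findlabels() and cross-checks the result against the is_jump_target flag reported by get_instructions(). The concrete offsets depend on the Python version.

```python
import dis


def sign(x):
    # Hypothetical function containing a conditional jump.
    if x < 0:
        return -1
    return 1


# findlabels() works on the raw bytecode, i.e. the co_code bytes.
labels = dis.findlabels(sign.__code__.co_code)

# Cross-check against the is_jump_target flag of each instruction.
targets = {ins.offset
           for ins in dis.get_instructions(sign)
           if ins.is_jump_target}

print(labels, sorted(targets))
```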

dis.stack_effect(opcode[, oparg])

Compute the stack effect of opcode with argument oparg.

New in version 3.4.
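For example, stack_effect() can be queried for opcodes looked up by name in dis.opmap. POP_TOP takes no argument, while LOAD_CONST requires one; the argument value 0 used here is arbitrary.

```python
import dis

pop_top = dis.opmap["POP_TOP"]
load_const = dis.opmap["LOAD_CONST"]

# A negative result means the instruction shrinks the stack.
print(dis.stack_effect(pop_top))

# A positive result means the instruction grows the stack.
print(dis.stack_effect(load_const, 0))
```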

33.12.3. Python Bytecode Instructions

The get_instructions() function and Bytecode class provide details of bytecode instructions as Instruction instances:

class dis.Instruction

Details for a bytecode operation

opcode

numeric code for operation, corresponding to the opcode values listed below and the bytecode values in the Opcode collections.

opname

human readable name for operation

arg

numeric argument to operation (if any), otherwise None

argval

resolved arg value (if known), otherwise same as arg

argrepr

human readable description of operation argument

offset

start index of operation within bytecode sequence

starts_line

line started by this opcode (if any), otherwise None

is_jump_target

True if other code jumps to here, otherwise False

New in version 3.4.
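The attributes listed above can be inspected on any Instruction. This sketch prints them for the first instruction of myfunc(); which instruction that is, and the values shown, differ between Python versions.

```python
import dis


def myfunc(alist):
    return len(alist)


first = next(dis.get_instructions(myfunc))

# Each documented attribute is available by name.
for name in ("opcode", "opname", "arg", "argval", "argrepr",
             "offset", "starts_line", "is_jump_target"):
    print(name, "=", getattr(first, name))
```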

The Python compiler currently generates the following bytecode instructions.

General instructions

NOP

Do nothing code. Used as a placeholder by the bytecode optimizer.

POP_TOP

Removes the top-of-stack (TOS) item.

ROT_TWO

Swaps the two top-most stack items.

ROT_THREE

Lifts second and third stack item one position up, moves top down to position three.

DUP_TOP

Duplicates the reference on top of the stack.

New in version 3.2.

DUP_TOP_TWO

Duplicates the two references on top of the stack, leaving them in the same order.

New in version 3.2.
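The effect of these stack manipulation instructions can be pictured with a toy model in which a Python list plays the role of the value stack and the last element is TOS. The functions below are illustrative only and are not how the interpreter is implemented.

```python
def rot_two(stack):
    # Swap the two top-most items.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def rot_three(stack):
    # Move the top item down to position three; the second and
    # third items each move up one position.
    stack.insert(-2, stack.pop())


def dup_top(stack):
    # Push a second reference to the top item.
    stack.append(stack[-1])


def dup_top_two(stack):
    # Push references to the two top items, keeping their order.
    stack.extend(stack[-2:])


demo = ["a", "b", "c"]   # "c" is TOS
rot_three(demo)
print(demo)
```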

Unary operations

Unary operations take the top of the stack, apply the operation, and push the result back on the stack.

UNARY_POSITIVE

Implements TOS = +TOS.

UNARY_NEGATIVE

Implements TOS = -TOS.

UNARY_NOT

Implements TOS = not TOS.

UNARY_INVERT

Implements TOS = ~TOS.

GET_ITER

Implements TOS = iter(TOS).

GET_YIELD_FROM_ITER

If TOS is a generator iterator or coroutine object it is left as is. Otherwise, implements TOS = iter(TOS).

New in version 3.5.

Binary operations

Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.
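The operand order can be made concrete with a toy model that uses a list as the stack (last element is TOS). The helper binary_op() is hypothetical and only mirrors the description above.

```python
import operator


def binary_op(stack, func):
    tos = stack.pop()     # right-hand operand
    tos1 = stack.pop()    # left-hand operand
    stack.append(func(tos1, tos))


stack = [7, 2]                      # TOS1 = 7, TOS = 2
binary_op(stack, operator.sub)      # models BINARY_SUBTRACT
print(stack)

stack = [["a", "b", "c"], 1]        # TOS1 is the container, TOS the index
binary_op(stack, operator.getitem)  # models BINARY_SUBSCR
print(stack)
```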

BINARY_POWER

Implements TOS = TOS1 ** TOS.

BINARY_MULTIPLY

Implements TOS = TOS1 * TOS.

BINARY_MATRIX_MULTIPLY

Implements TOS = TOS1 @ TOS.

New in version 3.5.

BINARY_FLOOR_DIVIDE

Implements TOS = TOS1 // TOS.

BINARY_TRUE_DIVIDE

Implements TOS = TOS1 / TOS.

BINARY_MODULO

Implements TOS = TOS1 % TOS.

BINARY_ADD

Implements TOS = TOS1 + TOS.

BINARY_SUBTRACT

Implements TOS = TOS1 - TOS.

BINARY_SUBSCR

Implements TOS = TOS1[TOS].

BINARY_LSHIFT

Implements TOS = TOS1 << TOS.

BINARY_RSHIFT

Implements TOS = TOS1 >> TOS.

BINARY_AND

Implements TOS = TOS1 & TOS.

BINARY_XOR

Implements TOS = TOS1 ^ TOS.

BINARY_OR

Implements TOS = TOS1 | TOS.

In-place operations

In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.
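Whether the result is the original TOS1 depends on the type of the left-hand operand. The sketch below uses operator.iadd(), which computes what an augmented assignment with += computes, to compare a list (which supports in-place addition) with a tuple (which does not).

```python
import operator

items = [1, 2]
alias = items
items = operator.iadd(items, [3])   # what `items += [3]` computes
print(items is alias)               # the list was extended in place

point = (1, 2)
original = point
point = operator.iadd(point, (3,))  # tuples fall back to ordinary addition
print(point is original)            # a new tuple was created
```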

INPLACE_POWER

Implements in-place TOS = TOS1 ** TOS.

INPLACE_MULTIPLY

Implements in-place TOS = TOS1 * TOS.

INPLACE_MATRIX_MULTIPLY

Implements in-place TOS = TOS1 @ TOS.

New in version 3.5.

INPLACE_FLOOR_DIVIDE

Implements in-place TOS = TOS1 // TOS.

INPLACE_TRUE_DIVIDE

Implements in-place TOS = TOS1 / TOS.

INPLACE_MODULO

Implements in-place TOS = TOS1 % TOS.

INPLACE_ADD

Implements in-place TOS = TOS1 + TOS.

INPLACE_SUBTRACT

Implements in-place TOS = TOS1 - TOS.

INPLACE_LSHIFT

Implements in-place TOS = TOS1 << TOS.

INPLACE_RSHIFT

Implements in-place TOS = TOS1 >> TOS.

INPLACE_AND

Implements in-place TOS = TOS1 & TOS.

INPLACE_XOR

Implements in-place TOS = TOS1 ^ TOS.

INPLACE_OR

Implements in-place TOS = TOS1 | TOS.

STORE_SUBSCR

Implements TOS1[TOS] = TOS2.

DELETE_SUBSCR

Implements del TOS1[TOS].

Coroutine opcodes

GET_AWAITABLE

Implements TOS = get_awaitable(TOS), where get_awaitable(o) returns o if o is a coroutine object or a generator object with the CO_ITERABLE_COROUTINE flag, or resolves o.__await__.

New in version 3.5.

GET_AITER

Implements TOS = TOS.__aiter__().

New in version 3.5.

Changed in version 3.7: Returning awaitable objects from __aiter__ is no longer supported.

GET_ANEXT

Implements PUSH(get_awaitable(TOS.__anext__())). See GET_AWAITABLE for details about get_awaitable.

New in version 3.5.

BEFORE_ASYNC_WITH

Resolves __aenter__ and __aexit__ from the object on top of the stack. Pushes __aexit__ and result of __aenter__() to the stack.

New in version 3.5.

SETUP_ASYNC_WITH

Creates a new frame object.

New in version 3.5.

Miscellaneous opcodes

PRINT_EXPR

Implements the expression statement for the interactive mode. TOS is removed from the stack and printed. In non-interactive mode, an expression statement is terminated with POP_TOP.

BREAK_LOOP

Terminates a loop due to a break statement.

CONTINUE_LOOP(target)

Continues a loop due to a continue statement. target is the address to jump to (which should be a FOR_ITER instruction).

SET_ADD(i)

Calls set.add(TOS1[-i], TOS). Used to implement set comprehensions.

LIST_APPEND(i)

Calls list.append(TOS1[-i], TOS). Used to implement list comprehensions.

MAP_ADD(i)

Calls dict.__setitem__(TOS1[-i], TOS, TOS1). Used to implement dict comprehensions.

New in version 3.1.

For all of the SET_ADD, LIST_APPEND and MAP_ADD instructions, while the added value or key/value pair is popped off, the container object remains on the stack so that it is available for further iterations of the loop.
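To see where these instructions are emitted, the following sketch compiles one comprehension of each kind and collects the opcode names. The helper opnames() is hypothetical; it also walks nested code objects because, depending on the Python version, a comprehension may be compiled into its own code object.

```python
import dis


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            names |= opnames(const)
    return names


examples = {
    "LIST_APPEND": "[x for x in data]",
    "SET_ADD": "{x for x in data}",
    "MAP_ADD": "{x: x for x in data}",
}

for opcode_name, source in examples.items():
    found = opnames(compile(source, "<example>", "eval"))
    print(opcode_name, opcode_name in found)
```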

RETURN_VALUE

Returns with TOS to the caller of the function.

YIELD_VALUE

Pops TOS and yields it from a generator.

YIELD_FROM

Pops TOS and delegates to it as a subiterator from a generator.

New in version 3.3.

SETUP_ANNOTATIONS

Checks whether __annotations__ is defined in locals(), if not it is set up to an empty dict. This opcode is only emitted if a class or module body contains variable annotations statically.

New in version 3.6.

IMPORT_STAR

Loads all symbols not starting with '_' directly from the module TOS to the local namespace. The module is popped after loading all names. This opcode implements from module import *.

POP_BLOCK

Removes one block from the block stack. Per frame, there is a stack of blocks, denoting nested loops, try statements, and such.

POP_EXCEPT

Removes one block from the block stack. The popped block must be an exception handler block, as implicitly created when entering an except handler. In addition to popping extraneous values from the frame stack, the last three popped values are used to restore the exception state.

END_FINALLY

Terminates a finally clause. The interpreter recalls whether the exception has to be re-raised, or whether the function returns, and continues with the outer-next block.

LOAD_BUILD_CLASS

Pushes builtins.__build_class__() onto the stack. It is later called by CALL_FUNCTION to construct a class.

SETUP_WITH(delta)

This opcode performs several operations before a with block starts. First, it loads __exit__() from the context manager and pushes it onto the stack for later use by WITH_CLEANUP. Then, __enter__() is called, and a finally block pointing to delta is pushed. Finally, the result of calling the enter method is pushed onto the stack. The next opcode will either ignore it (POP_TOP), or store it in (a) variable(s) (STORE_FAST, STORE_NAME, or UNPACK_SEQUENCE).

New in version 3.2.

WITH_CLEANUP_START

Cleans up the stack when a with statement block exits. TOS is the context manager’s __exit__() bound method. Below TOS are 1–3 values indicating how/why the finally clause was entered:

  • SECOND = None
  • (SECOND, THIRD) = (WHY_{RETURN,CONTINUE}), retval
  • SECOND =