Python Notes (3): How the Virtual Machine Runs

Any discussion of how Python runs has to start with .pyc files and bytecode. A module's PyCodeObject is created when the module is loaded, that is, on import.

.pyc files

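  1. Running python test.py compiles test.py to bytecode and interprets it, but does not write a test.pyc.
  2. If test.py imports another module, e.g. import urllib2, Python compiles urllib2.py to bytecode, writes urllib2.pyc, and then interprets the bytecode. (In Python 3 the cached file is written under a __pycache__ directory.)
  3. To produce test.pyc yourself, use the built-in py_compile module, for example by running python -m py_compile test.py.
  4. When a module is loaded and both the .py and the .pyc exist, Python runs the .pyc. If the .pyc is older than the .py, Python recompiles the .py and updates the .pyc.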
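
The manual compilation step in item 3 can be sketched as follows. This is a minimal sketch that assumes Python 3, where the cached file lands in a __pycache__ directory rather than next to the source. The file name test.py and the temporary directory are only for illustration.

```python
import importlib.util
import os
import py_compile
import tempfile

# Work in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, "test.py")
    with open(source_path, "w", encoding="utf-8") as fh:
        fh.write("print('hello')\n")

    # Same effect as running: python -m py_compile test.py
    pyc_path = py_compile.compile(source_path, doraise=True)

    # Where the import system expects the cached bytecode for this source file.
    expected_path = importlib.util.cache_from_source(source_path)

    pyc_exists = os.path.exists(pyc_path)
    same_location = os.path.samefile(pyc_path, expected_path)
    print(pyc_path)
```

The printed path shows where the interpreter would look for the cached bytecode on the next import of that module.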

PyCodeObject

Compiling Python code means producing PyCodeObject objects. Below is the PyCodeObject definition from Python 3.5.7.

/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;		/* #arguments, except *args (number of positional parameters of the code block) */
    int co_kwonlyargcount;	/* #keyword only arguments */
    int co_nlocals;		/* #local variables */
    int co_stacksize;		/* #entries needed for evaluation stack */
    int co_flags;		/* CO_..., see below */
    PyObject *co_code;		/* instruction opcodes */
    PyObject *co_consts;	/* list (constants used) */
    PyObject *co_names;		/* list of strings (names used) */
    PyObject *co_varnames;	/* tuple of strings (local variable names) */
    PyObject *co_freevars;	/* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest aren't used in either hash or comparisons, except for
       co_name (used in both) and co_firstlineno (used only in
       comparisons).  This is done to preserve the name and line number
       for tracebacks and debuggers; otherwise, constant de-duplication
       would collapse identical functions/lambdas defined on different lines.
    */
    unsigned char *co_cell2arg; /* Maps cell vars which are arguments. */
    PyObject *co_filename;	/* unicode (where it was loaded from) */
    PyObject *co_name;		/* unicode (name, for reference) */
    int co_firstlineno;		/* first source line number */
    PyObject *co_lnotab;	/* string (encoding addr<->lineno mapping) See
				   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;     /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
} PyCodeObject;
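
From Python code, instances of this struct are visible as types.CodeType objects. The following sketch builds one directly with the built-in compile(), without importing anything. The source string and the file name demo.py are made up for illustration.

```python
import types

# Illustrative source: one module-level assignment and one function.
source = "x = 1\ndef f(a):\n    return a + x\n"

# compile() turns source text into a code object without executing it.
module_code = compile(source, "demo.py", "exec")

# Nested code blocks (here the function f) are stored among the
# constants of the enclosing code object.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]

print(type(module_code).__name__, module_code.co_name, module_code.co_filename)
print([c.co_name for c in nested])
```

Each function, class body and module gets its own code object, which is why the fields described below are reported per code block.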
  1. co_argcount, co_kwonlyargcount

PEP 3102: http://www.python.org/dev/peps/pep-3102/

Keyword-only argument: a named parameter that appears after *varargs in a function's parameter list can only be passed by keyword.

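At call time, arguments are bound in this order: positional arguments first, with any surplus positional arguments collected into *varargs, then keyword arguments matched by name.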

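co_argcount: the number of positional parameters of the code block, not counting the variadic parameter *varargs.

co_kwonlyargcount: the number of keyword-only parameters of the code block, i.e. the parameters declared after *varargs. Every parameter after the variadic one must be passed in key=value form.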

>>> def func(a, b, *d, c):
...     m = 1
...     pass
...
>>> func.__code__.co_argcount
2
>>> func.__code__.co_kwonlyargcount
1
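
As a runnable variant of the session above, the sketch below also shows what "keyword-only" means at call time. The function body is changed to return its arguments purely so the binding can be printed. That change is an assumption made for illustration.

```python
def func(a, b, *d, c):
    # Return the bound values so the call-time binding is visible.
    return a, b, d, c

code = func.__code__
print(code.co_argcount)        # counts a and b
print(code.co_kwonlyargcount)  # counts c

# Surplus positional arguments go into d; c has to be passed by name.
print(func(1, 2, 3, 4, c=5))

try:
    func(1, 2, 3)  # 3 lands in d, so c is still missing
except TypeError as exc:
    print("TypeError:", exc)
```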
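  2. co_nlocals: the number of local variables in the code block, i.e. its parameters (co_argcount + co_kwonlyargcount + the variadic parameters) plus the local variables assigned inside the block.
>>> f1.__code__.co_nlocals
7

Here f1 has the signature f1(a, b, c, *d, e, f) and assigns m = 1 in its body (the same function as in the plain-function test further down), so the seven locals are a, b, c, e, f, d and m.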

  3. co_stacksize: the number of evaluation-stack entries needed to execute the code block.
>>> f1.__code__.co_stacksize
1
  4. co_code: the sequence of bytecode instructions produced by compiling the code block.
>>> f1.__code__.co_code
b'd\x01\x00}\x06\x00d\x00\x00S'
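  5. co_consts, co_names
  • co_consts: a tuple of all constants used in the code block
  • co_names: a tuple of the symbols (names) used in the code block other than its local variables, such as global names and attribute names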
>>> f1.__code__.co_consts
(None, 1)
>>> f1.__code__.co_names
()
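
To see how co_code refers to co_consts and the local variable table, the bytes can be decoded with the standard dis module. In the Python 3.5 output above, each instruction with an argument is an opcode byte followed by a two-byte argument: d is LOAD_CONST (an index into co_consts), } is STORE_FAST (an index into the local variables) and S is RETURN_VALUE. The sketch below prints the same information symbolically. The definition of f1 is assumed to match the function used in the full test later in this post, and the exact instructions differ between Python versions.

```python
import dis

def f1(a, b, c, *d, e, f):
    m = 1
    pass

# Each instruction carries a raw numeric argument (arg) and the value it
# resolves to (argval), e.g. a constant or a local variable name.
instructions = list(dis.get_instructions(f1))
for ins in instructions:
    print(ins.offset, ins.opname, ins.arg, ins.argval)

# The assignment m = 1 shows up as a STORE_FAST into the local slot for m.
stores = [ins for ins in instructions if ins.opname == "STORE_FAST"]
```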
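  6. co_filename, co_name
  • co_filename: the path of the .py file the code block was compiled from
  • co_name: the name of the code block, usually a function name or class name
>>> f1.__code__.co_filename
'<stdin>'  # typed at the interactive prompt, hence <stdin>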
>>> f1.__code__.co_name
'f1'
  7. co_firstlineno: the line on which the code block starts in its .py file. For example, test_1.py:
def f1(a, b, c, *d, e, f):
    m = 1
    pass
print(f1.__code__.co_firstlineno)

Output:

1
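  8. co_varnames, co_freevars, co_cellvars
  • co_varnames: variables assigned in this code block that are not referenced by an inner code block
  • co_freevars (free variables): variables referenced in this code block but assigned in an enclosing code block
  • co_cellvars (cell variables, i.e. variables captured by inner code): variables assigned in this code block and referenced by an inner code block

Test with a plain function: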

def func(a, b, c, *d, e, f):
    m = 1
    pass

print('co_argcount        :', func.__code__.co_argcount)
print('co_kwonlyargcount  :', func.__code__.co_kwonlyargcount)
print('co_nlocals         :', func.__code__.co_nlocals)
print('co_stacksize       :', func.__code__.co_stacksize)
print('co_flags           :', func.__code__.co_flags)
print('co_code            :', func.__code__.co_code)
print('co_consts          :', func.__code__.co_consts)
print('co_names           :', func.__code__.co_names)
print('co_varnames        :', func.__code__.co_varnames)
print('co_freevars        :', func.__code__.co_freevars)
print('co_cellvars        :', func.__code__.co_cellvars)
print('co_filename        :', func.__code__.co_filename)
print('co_name            :', func.__code__.co_name)
print('co_firstlineno     :', func.__code__.co_firstlineno)
print('co_lnotab          :', func.__code__.co_lnotab)

Output:

co_argcount        : 3
co_kwonlyargcount  : 2
co_nlocals         : 7
co_stacksize       : 1
co_flags           : 71
co_code            : b'd\x01\x00}\x06\x00d\x00\x00S'
co_consts          : (None, 1)
co_names           : ()
co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'm')
co_freevars        : ()
co_cellvars        : ()
co_filename        : pyvm_test2_function.py
co_name            : func
co_firstlineno     : 1
co_lnotab          : b'\x00\x01\x06\x01'

Test with a nested function:

def func(a, b, c, *d, e, f):
    m = 1
    def wapper():
        n = m
    print('wapper-->co_argcount        :', wapper.__code__.co_argcount)
    print('wapper-->co_kwonlyargcount  :', wapper.__code__.co_kwonlyargcount)
    print('wapper-->co_nlocals         :', wapper.__code__.co_nlocals)
    print('wapper-->co_stacksize       :', wapper.__code__.co_stacksize)
    print('wapper-->co_flags           :', wapper.__code__.co_flags)
    print('wapper-->co_code            :', wapper.__code__.co_code)
    print('wapper-->co_consts          :', wapper.__code__.co_consts)
    print('wapper-->co_names           :', wapper.__code__.co_names)
    print('wapper-->co_varnames        :', wapper.__code__.co_varnames)
    print('wapper-->co_freevars        :', wapper.__code__.co_freevars)
    print('wapper-->co_cellvars        :', wapper.__code__.co_cellvars)
    print('wapper-->co_filename        :', wapper.__code__.co_filename)
    print('wapper-->co_name            :', wapper.__code__.co_name)
    print('wapper-->co_firstlineno     :', wapper.__code__.co_firstlineno)
    print('wapper-->co_lnotab          :', wapper.__code__.co_lnotab)

print('func-->co_argcount        :', func.__code__.co_argcount)
print('func-->co_kwonlyargcount  :', func.__code__.co_kwonlyargcount)
print('func-->co_nlocals         :', func.__code__.co_nlocals)
print('func-->co_stacksize       :', func.__code__.co_stacksize)
print('func-->co_flags           :', func.__code__.co_flags)
print('func-->co_code            :', func.__code__.co_code)
print('func-->co_consts          :', func.__code__.co_consts)
print('func-->co_names           :', func.__code__.co_names)
print('func-->co_varnames        :', func.__code__.co_varnames)
print('func-->co_freevars        :', func.__code__.co_freevars)
print('func-->co_cellvars        :', func.__code__.co_cellvars)
print('func-->co_filename        :', func.__code__.co_filename)
print('func-->co_name            :', func.__code__.co_name)
print('func-->co_firstlineno     :', func.__code__.co_firstlineno)
print('func-->co_lnotab          :', func.__code__.co_lnotab)
print('=========================================================')
func(1, 2, 3, 4, 5, 6, 7, e = 8, f = 9)

Output:

func-->co_argcount        : 3
func-->co_kwonlyargcount  : 2
func-->co_nlocals         : 7
func-->co_stacksize       : 3
func-->co_flags           : 7
func-->co_code            : b'd\x01\x00\x89\x00\x00\x87\x00\x00f\x01\x00d\x02\x00d\x03\x00\x86\x00\x00}\x06\x00t\x00\x00d\x04\x00|\x06\x00j\x01\x00j\x02\x00\x83\x02\x00\x01t\x00\x00d\x05\x00|\x06\x00j\x01\x00j\x03\x00\x83\x02\x00\x01t\x00\x00d\x06\x00|\x06\x00j\x01\x00j\x04\x00\x83\x02\x00\x01t\x00\x00d\x07\x00|\x06\x00j\x01\x00j\x05\x00\x83\x02\x00\x01t\x00\x00d\x08\x00|\x06\x00j\x01\x00j\x06\x00\x83\x02\x00\x01t\x00\x00d\t\x00|\x06\x00j\x01\x00j\x07\x00\x83\x02\x00\x01t\x00\x00d\n\x00|\x06\x00j\x01\x00j\x08\x00\x83\x02\x00\x01t\x00\x00d\x0b\x00|\x06\x00j\x01\x00j\t\x00\x83\x02\x00\x01t\x00\x00d\x0c\x00|\x06\x00j\x01\x00j\n\x00\x83\x02\x00\x01t\x00\x00d\r\x00|\x06\x00j\x01\x00j\x0b\x00\x83\x02\x00\x01t\x00\x00d\x0e\x00|\x06\x00j\x01\x00j\x0c\x00\x83\x02\x00\x01t\x00\x00d\x0f\x00|\x06\x00j\x01\x00j\r\x00\x83\x02\x00\x01t\x00\x00d\x10\x00|\x06\x00j\x01\x00j\x0e\x00\x83\x02\x00\x01t\x00\x00d\x11\x00|\x06\x00j\x01\x00j\x0f\x00\x83\x02\x00\x01t\x00\x00d\x12\x00|\x06\x00j\x01\x00j\x10\x00\x83\x02\x00\x01d\x00\x00S'
func-->co_consts          : (None, 1, <code object wapper at 0x000002A033189B70, file "pyvm_test3_function.py", line 3>, 'func.<locals>.wapper', 'wapper-->co_argcount        :', 'wapper-->co_kwonlyargcount  :', 'wapper-->co_nlocals         :', 'wapper-->co_stacksize       :', 'wapper-->co_flags           :', 'wapper-->co_code            :', 'wapper-->co_consts          :', 'wapper-->co_names           :', 'wapper-->co_varnames        :', 'wapper-->co_freevars        :', 'wapper-->co_cellvars        :', 'wapper-->co_filename        :', 'wapper-->co_name            :', 'wapper-->co_firstlineno     :', 'wapper-->co_lnotab          :')
func-->co_names           : ('print', '__code__', 'co_argcount', 'co_kwonlyargcount', 'co_nlocals', 'co_stacksize', 'co_flags', 'co_code', 'co_consts', 'co_names', 'co_varnames', 'co_freevars', 'co_cellvars', 'co_filename', 'co_name', 'co_firstlineno', 'co_lnotab')
func-->co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'wapper')
func-->co_freevars        : ()
func-->co_cellvars        : ('m',)
func-->co_filename        : pyvm_test3_function.py
func-->co_name            : func
func-->co_firstlineno     : 1
func-->co_lnotab          : b'\x00\x01\x06\x01\x12\x02\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01\x13\x01'
=========================================================
wapper-->co_argcount        : 0
wapper-->co_kwonlyargcount  : 0
wapper-->co_nlocals         : 1
wapper-->co_stacksize       : 1
wapper-->co_flags           : 19
wapper-->co_code            : b'\x88\x00\x00}\x00\x00d\x00\x00S'
wapper-->co_consts          : (None,)
wapper-->co_names           : ()
wapper-->co_varnames        : ('n',)
wapper-->co_freevars        : ('m',)
wapper-->co_cellvars        : ()
wapper-->co_filename        : pyvm_test3_function.py
wapper-->co_name            : wapper
wapper-->co_firstlineno     : 3
wapper-->co_lnotab          : b'\x00\x01'
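
The co_flags values in the two outputs above (71, 7 and 19) are bit masks. The sketch below decodes them using the flag bits from CPython 3.5's Include/code.h. The table is copied by hand and only covers the low bits, and decode_flags is a hypothetical helper written for this post. Other Python versions may define the bits differently.

```python
# Flag bits as defined in CPython 3.5's Include/code.h (assumed here;
# hand-copied and limited to the low seven bits).
CO_FLAGS = {
    0x0001: "CO_OPTIMIZED",
    0x0002: "CO_NEWLOCALS",
    0x0004: "CO_VARARGS",
    0x0008: "CO_VARKEYWORDS",
    0x0010: "CO_NESTED",
    0x0020: "CO_GENERATOR",
    0x0040: "CO_NOFREE",
}

def decode_flags(value):
    """Return the names of the known flag bits set in a co_flags value."""
    return [name for bit, name in sorted(CO_FLAGS.items()) if value & bit]

for value in (71, 7, 19):
    print(value, decode_flags(value))
```

Read this way, the plain function (71) has CO_NOFREE set because it has no free or cell variables, the outer func of the nested test (7) loses that bit because m is a cell variable, and wapper (19) carries CO_NESTED and no CO_VARARGS.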

Closure test, where func returns wapper and the returned function is bound to f3. Output:

func-->co_argcount        : 3
func-->co_kwonlyargcount  : 2
func-->co_nlocals         : 7
func-->co_stacksize       : 3
func-->co_flags           : 7
func-->co_code            : b'd\x01\x00\x89\x00\x00\x87\x00\x00f\x01\x00d\x02\x00d\x03\x00\x86\x00\x00}\x06\x00|\x06\x00S'
func-->co_consts          : (None, 1, <code object wapper at 0x0000019920289B70, file "pyvm_test4_function.py", line 3>, 'func.<locals>.wapper')
func-->co_names           : ()
func-->co_varnames        : ('a', 'b', 'c', 'e', 'f', 'd', 'wapper')
func-->co_freevars        : ()
func-->co_cellvars        : ('m',)
func-->co_filename        : pyvm_test4_function.py
func-->co_name            : func
func-->co_firstlineno     : 1
func-->co_lnotab          : b'\x00\x01\x06\x01\x12\x02'
=========================================================
f3-->co_argcount        : 0
f3-->co_kwonlyargcount  : 0
f3-->co_nlocals         : 1
f3-->co_stacksize       : 1
f3-->co_flags           : 19
f3-->co_code            : b'\x88\x00\x00}\x00\x00d\x00\x00S'
f3-->co_consts          : (None,)
f3-->co_names           : ()
f3-->co_varnames        : ('n',)
f3-->co_freevars        : ('m',)
f3-->co_cellvars        : ()
f3-->co_filename        : pyvm_test4_function.py
f3-->co_name            : wapper
f3-->co_firstlineno     : 3
f3-->co_lnotab          : b'\x00\x01'
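
The script for the closure test is not included above. The following is a guess at an equivalent one, reconstructed on the assumption that func returns wapper and the result is bound to f3, which is what the bytecode and the f3 labels suggest. It prints only the fields relevant to closures.

```python
def func(a, b, c, *d, e, f):
    m = 1
    def wapper():
        n = m  # m comes from the enclosing func
    return wapper

f3 = func(1, 2, 3, 4, 5, 6, 7, e=8, f=9)

print(func.__code__.co_cellvars)  # variables func shares with inner code
print(f3.__code__.co_freevars)    # variables wapper takes from outside
print(f3.__code__.co_name)

# The captured value lives in a cell object attached to the function.
print(f3.__closure__[0].cell_contents)
```

The name m appears in co_cellvars of the outer code object and in co_freevars of the inner one. The shared cell is what keeps the value alive after func has returned.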

  9. co_lnotab: the mapping between bytecode instruction offsets and line numbers in the source file

Objects/lnotab_notes.txt: All about co_lnotab, the line number table. Code objects store a field named co_lnotab. This is an array of unsigned bytes disguised as a Python string. It is used to map bytecode offsets to source code line #s for tracebacks and to identify line number boundaries for line tracing. The array is conceptually a compressed list of (bytecode offset increment, line number increment) pairs. The details are important and delicate, best illustrated by example:

byte code offset    source code line number
        0                      1
        6                      2
       50                      7
      350                    307
      361                    308

Instead of storing these numbers literally, we compress the list by storing only the increments from one row to the next. Conceptually, the stored list might look like:

0, 1, 6, 1, 44, 5, 300, 300, 11, 1

Accumulating these increments pair by pair recovers the table: 0, 1, (0+6), (1+1), (6+44), (2+5), (50+300), (7+300), (350+11), (307+1).
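
The accumulation can be sketched in a few lines. expand_increments is a hypothetical helper written for this post. It treats every value as an unsigned increment and ignores how a real co_lnotab splits increments that do not fit in one byte (such as 300), so it is an illustration of the idea, not a full decoder. The bytes literal is the co_lnotab of the plain-function test earlier in this post, whose co_firstlineno is 1.

```python
def expand_increments(increments, first_line=0):
    """Turn a flat (offset increment, line increment, ...) sequence
    into a list of absolute (bytecode offset, line number) rows."""
    offset, line = 0, first_line
    rows = []
    for i in range(0, len(increments) - 1, 2):
        offset += increments[i]
        line += increments[i + 1]
        rows.append((offset, line))
    return rows

# The conceptual list from lnotab_notes.txt, starting from line 0.
conceptual = [0, 1, 6, 1, 44, 5, 300, 300, 11, 1]
table = expand_increments(conceptual)
print(table)

# co_lnotab of the plain function; indexing bytes yields integers.
real = expand_increments(b'\x00\x01\x06\x01', first_line=1)
print(real)
```

For the plain function this places offset 0 on line 2 (m = 1) and offset 6 on line 3 (pass, followed by the implicit return).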

References:

[How the Python virtual machine runs] https://www.cnblogs.com/webber1992/p/6597166.html
