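CPython-05-Interpreter Core-Overview
1. Module Responsibilities
The interpreter core is the heart of CPython's execution engine and is responsible for executing compiled bytecode. It implements a stack-based virtual machine that runs Python programs by repeatedly fetching and executing bytecode instructions.
Core responsibilities
- Bytecode execution: execute the instructions in a code object one by one
- Frame management: maintain the call stack and the execution context
- Evaluation stack operations: manage the operand stack that instructions push to and pop from
- Instruction dispatch: route each opcode to its handler
- Exception handling: catch and propagate exceptions and run try-except logic
- Performance optimization: inline caches, instruction specialization, JIT compilation (Tier 2)
Inputs and outputs
Inputs:
- Code object (PyCodeObject*)
- Global namespace (globals)
- Local namespace (locals)
- Arguments (args, kwargs)
Outputs:
- Return value (PyObject*)
- Exception information (set via PyErr_*)
- Side effects (object mutation, I/O, and so on)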
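Upstream and downstream dependencies
Upstream (callers):
- PyRun_* functions: execute code strings and files
- Function calls: Python functions calling each other
- eval(), exec(): dynamic execution
Downstream (callees):
- Objects/*: all object operations
- Python/compile.c: dynamic compilation (eval)
- Modules/*: built-in module function calls
- Python/errors.c: setting and propagating exceptions
Lifecycle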
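stateDiagram-v2
[*] --> CreateFrame: PyEval_EvalCode()
CreateFrame --> InitStack: allocate stack space
InitStack --> EnterInterpreter: _PyEval_EvalFrameDefault()
EnterInterpreter --> InstructionLoop: DISPATCH()
InstructionLoop --> Fetch: next_instr++
Fetch --> Decode: opcode, oparg
Decode --> Execute: switch(opcode)
Execute --> Completed: most instructions
Completed --> CheckSignals: check_periodics()
CheckSignals --> InstructionLoop: continue
Execute --> Call: CALL
Call --> InlinedCall: Python function
InlinedCall --> InstructionLoop: frame switch
Execute --> Return: RETURN_VALUE
Return --> PopFrame: restore caller
PopFrame --> InstructionLoop: resume caller
PopFrame --> [*]: top-level return
Execute --> Exception: ERROR
Exception --> FindHandler: exception table
FindHandler --> InstructionLoop: handler found
FindHandler --> UnwindFrame: not found
UnwindFrame --> FindHandler: parent frame
UnwindFrame --> [*]: uncaught exception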
2. Overall Architecture
flowchart TB
subgraph Entry
A[PyEval_EvalCode] --> B[PyEval_EvalFrame]
B --> C[_PyEval_EvalFrame]
C --> D[_PyEval_EvalFrameDefault]
end
subgraph Frames [Frame management]
E[_PyInterpreterFrame] --> F[stack_pointer]
E --> G[instr_ptr]
E --> H[locals + stack]
E --> I[previous frame]
end
subgraph Loop [Instruction loop]
J[DISPATCH macro] --> K[Fetch]
K --> L[Decode]
L --> M{USE_COMPUTED_GOTOS?}
M -->|yes| N[Jump table]
M -->|no| O[switch statement]
N --> P[Execute instruction]
O --> P
end
subgraph Handlers [Instruction implementations]
P --> Q[Stack instructions]
P --> R[Object instructions]
P --> S[Control-flow instructions]
P --> T[Call instructions]
P --> U[Exception instructions]
end
subgraph Optimizations [Performance optimizations]
V[Inline caches] --> W[LOAD_ATTR cache]
V --> X[LOAD_GLOBAL cache]
V --> Y[CALL cache]
Z[Specialization] --> AA[quickening]
Z --> AB[adaptive instructions]
AB --> AC[specialized variants]
AD[Tier 2 optimizer] --> AE[_PyUOpExecutor]
AD --> AF[trace cache]
AD --> AG[JIT compilation]
end
subgraph Exceptions [Exception handling]
AH[Exception raised] --> AI[PyErr_Occurred]
AI --> AJ[Look up exception table]
AJ --> AK[Unwind stack]
AK --> AL[Run handler]
AL --> AM{Handled?}
AM -->|yes| J
AM -->|no| AH
end
subgraph Periodic [Periodic checks]
AN[eval_breaker] --> AO[Signal handling]
AN --> AP[GIL drop request]
AN --> AQ[Async exception]
AN --> AR[Pending calls]
AN --> AS[GC trigger]
end
D --> E
D --> J
P --> V
P --> Z
P --> AD
P --> AH
J --> AN
Architecture notes
1. Core data structures
_PyInterpreterFrame (interpreter frame, simplified)
typedef struct _PyInterpreterFrame {
    PyObject *f_executable;                 // code object or function object
    struct _PyInterpreterFrame *previous;   // caller's frame
    _Py_CODEUNIT *instr_ptr;                // current instruction pointer
    _PyStackRef *stackpointer;              // top-of-stack pointer
    uint16_t return_offset;                 // return offset
    char owner;                             // who owns the frame
    bool is_entry;                          // whether this is an entry frame
    _PyStackRef localsplus[1];              // locals + stack space (variable length)
} _PyInterpreterFrame;
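Field notes:
- f_executable: the executable object
  - Code object: the compiled bytecode
  - Function object: wraps a code object and its closure
- previous: the call chain
  - Forms a singly linked list
  - Used for tracebacks and exception propagation
- instr_ptr: the instruction pointer
  - Points to the next instruction to execute
  - Type: _Py_CODEUNIT* (16-bit code units)
- stackpointer: the stack pointer
  - Points to the top of the evaluation stack
  - The stack grows upward, toward higher addresses
- localsplus: a single storage area
  - First the local variables (co_nlocals), followed by cell and free (closure) variables
  - Then the evaluation stack (co_stacksize slots)
Memory layout (N = number of local slots, M = co_stacksize):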
+-------------------+
| f_executable |
| previous |
| instr_ptr |
| stackpointer |
| return_offset |
| owner, is_entry |
+-------------------+
| localsplus[0]     | <- local variable 0
| localsplus[1]     | <- local variable 1
| ...               |
| localsplus[N-1]   | <- local variable N-1
+-------------------+
| localsplus[N]     | <- stack bottom
| localsplus[N+1]   |
| ...               | <- evaluation stack
| localsplus[N+M-1] | <- highest possible stack top
+-------------------+
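The sizes that drive this layout are visible from Python, because every code object exposes them as attributes. The following is a minimal sketch; `make_counter`, `bump`, and `frame_slots` are made-up names for illustration, and the exact `co_stacksize` values can differ between CPython versions.

```python
def make_counter(start):
    # `total` is captured by the inner function, so it becomes a cell variable.
    total = start

    def bump(step):
        nonlocal total
        total += step
        return total

    return bump


def frame_slots(code):
    """Summarize the slots that a frame for `code` has to provide."""
    return {
        "locals": code.co_nlocals,        # plain local variables (including arguments)
        "cells": len(code.co_cellvars),   # variables captured by inner functions
        "frees": len(code.co_freevars),   # variables captured from an outer scope
        "stack": code.co_stacksize,       # maximum evaluation stack depth
    }


outer = frame_slots(make_counter.__code__)
inner = frame_slots(make_counter(0).__code__)
print("make_counter:", outer)
print("bump:", inner)
```

Locals, cells, and frees together make up the first part of `localsplus`; the `stack` figure is the number of slots reserved after them.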
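PyCodeObject (code object, simplified)
struct PyCodeObject {
    PyObject_HEAD
    int co_argcount;                        // number of arguments
    int co_posonlyargcount;                 // number of positional-only arguments
    int co_kwonlyargcount;                  // number of keyword-only arguments
    int co_stacksize;                       // stack depth
    int co_firstlineno;                     // first line number
    PyObject *co_code;                      // bytecode (bytes object)
    PyObject *co_consts;                    // tuple of constants
    PyObject *co_names;                     // tuple of names
    PyObject *co_localsplusnames;           // names of local variables
    PyObject *co_exceptiontable;            // exception table
    int co_flags;                           // flag bits
    _PyCoCached *_co_cached;                // cached data (quickening)
    uint64_t _co_instrumentation_version;   // version number
    _Py_CODEUNIT *co_code_adaptive;         // adaptive bytecode
    ...
};
Key fields:
- co_code: the original bytecode, immutable (since CPython 3.11 it is no longer a stored struct field and is derived on demand, e.g. via PyCode_GetCode())
- co_code_adaptive: the mutable copy that quickening rewrites in place
- co_stacksize: the maximum stack depth, computed at compile time
- co_exceptiontable: the exception handling table
Flag bits (co_flags):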
#define CO_OPTIMIZED        0x0001  // locals live in fast slots (LOAD_FAST)
#define CO_NEWLOCALS        0x0002  // a new local namespace is created
#define CO_VARARGS          0x0004  // *args
#define CO_VARKEYWORDS      0x0008  // **kwargs
#define CO_GENERATOR        0x0020  // generator
#define CO_COROUTINE        0x0080  // coroutine
#define CO_ASYNC_GENERATOR  0x0200  // asynchronous generator
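These flags can be inspected from Python through `co_flags` and the matching `inspect.CO_*` constants. The sketch below uses four throwaway functions (`plain`, `gen`, `coro`, `agen`) and a helper `flag_names`, all invented for this example.

```python
import inspect


def plain(*args, **kwargs):
    return args, kwargs


def gen():
    yield 1


async def coro():
    return 1


async def agen():
    yield 1


FLAG_NAMES = [
    "CO_OPTIMIZED", "CO_NEWLOCALS", "CO_VARARGS", "CO_VARKEYWORDS",
    "CO_GENERATOR", "CO_COROUTINE", "CO_ASYNC_GENERATOR",
]


def flag_names(func):
    """Return the set of CO_* flag names that are set on func's code object."""
    flags = func.__code__.co_flags
    return {name for name in FLAG_NAMES if flags & getattr(inspect, name)}


for f in (plain, gen, coro, agen):
    print(f.__name__, sorted(flag_names(f)))
```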
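2. Instruction format
Code unit (_Py_CODEUNIT)
typedef uint16_t _Py_CODEUNIT;
// Bit layout (16 bits)
// | opcode (low 8 bits) | oparg (high 8 bits) |
Extraction macros:
#define _Py_OPCODE(word) ((word) & 0xFF)
#define _Py_OPARG(word) ((word) >> 8)
EXTENDED_ARG (extended argument)
For an oparg greater than 255:
EXTENDED_ARG high_byte
REAL_OPCODE low_byte
Accumulation:
// opcode and oparg hold the values decoded from the first code unit
while (opcode == EXTENDED_ARG) {
    word = *next_instr++;                     // read the next code unit
    opcode = _Py_OPCODE(word);
    oparg = (oparg << 8) | _Py_OPARG(word);   // shift in 8 more bits
}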
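The same decoding can be sketched in Python over a raw byte string, where each code unit is two bytes: the opcode followed by its 8-bit argument. The `decode` helper is illustrative only, and it deliberately ignores inline cache entries, which a real decoder for CPython 3.11+ also has to skip.

```python
import opcode


def decode(raw):
    """Yield (opcode, oparg) pairs from raw bytecode, folding EXTENDED_ARG prefixes."""
    extended = 0
    for i in range(0, len(raw), 2):
        op, arg = raw[i], raw[i + 1]
        oparg = (extended << 8) | arg
        if op == opcode.EXTENDED_ARG:
            extended = oparg      # carry the accumulated high bits forward
            continue
        extended = 0
        yield op, oparg


NOP = opcode.opmap["NOP"]
# EXTENDED_ARG 0x01 followed by NOP 0x02 encodes the argument 0x0102 = 258.
raw = bytes([opcode.EXTENDED_ARG, 0x01, NOP, 0x02, NOP, 0x07])
decoded = list(decode(raw))
print(decoded)
```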
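Inline caches
Some instructions are followed by cache entries:
LOAD_ATTR oparg
CACHE entry1
CACHE entry2
...
Cache structure (simplified example for LOAD_ATTR; the real layout has more entries and varies by version):
typedef struct {
    _Py_CODEUNIT counter;        // counter
    _Py_CODEUNIT type_version;   // type version
    _Py_CODEUNIT index;          // attribute index
} _PyLoadAttrCache;
3. Execution loop
Main loop (simplified)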
PyObject *
_PyEval_EvalFrameDefault(PyThreadState *tstate,
                         _PyInterpreterFrame *frame,
                         int throwflag)
{
    // 1. Initialize local state
    _Py_CODEUNIT *next_instr = frame->instr_ptr;
    _PyStackRef *stack_pointer = frame->stackpointer;
    PyCodeObject *co = (PyCodeObject *)frame->f_executable;
    // 2. Main loop
    for (;;) {
        // 2.1 Fetch
        _Py_CODEUNIT word = *next_instr++;
        uint8_t opcode = _Py_OPCODE(word);
        int oparg = _Py_OPARG(word);   // int, so EXTENDED_ARG can widen it
        // 2.2 Handle EXTENDED_ARG
        while (opcode == EXTENDED_ARG) {
            word = *next_instr++;
            opcode = _Py_OPCODE(word);
            oparg = (oparg << 8) | _Py_OPARG(word);
        }
        // 2.3 Dispatch
        switch (opcode) {
            case NOP:
                break;
            case LOAD_FAST:
                // load a local variable
                value = GETLOCAL(oparg);
                PUSH(value);
                break;
            case STORE_FAST:
                // store a local variable
                value = POP();
                SETLOCAL(oparg, value);
                break;
            case LOAD_CONST:
                // load a constant
                value = GETITEM(co->co_consts, oparg);
                PUSH(value);
                break;
            case BINARY_OP:
                // binary operation
                right = POP();
                left = POP();
                result = binary_op(left, oparg, right);
                PUSH(result);
                break;
            case RETURN_VALUE:
                // return
                retval = POP();
                goto exit_eval_frame;
            // ... more instructions
        }
        // 2.4 Periodic checks
        if (eval_breaker) {
            if (check_periodics(tstate) < 0) {
                goto error;
            }
        }
    }
exit_eval_frame:
    // clean up and return
    return retval;
error:
    // exception handling
    return NULL;
}
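To make the fetch-decode-dispatch cycle concrete, here is a toy stack machine in Python that mirrors the loop above. It is a sketch, not CPython's implementation: instructions are `(name, oparg)` tuples rather than packed code units, and `run` is a made-up function. The `BINARY_OP` argument values follow CPython's numbering (0 for add, 5 for multiply, 10 for subtract).

```python
import operator

BINARY_OPS = {0: operator.add, 5: operator.mul, 10: operator.sub}


def run(program, consts, nlocals):
    """Interpret a list of (opname, oparg) pairs on a toy stack machine."""
    local_vars = [None] * nlocals   # plays the role of the local slots
    stack = []                      # plays the role of the evaluation stack
    pc = 0
    while True:
        opname, oparg = program[pc]     # fetch and decode
        pc += 1
        if opname == "NOP":
            pass
        elif opname == "LOAD_FAST":
            stack.append(local_vars[oparg])
        elif opname == "STORE_FAST":
            local_vars[oparg] = stack.pop()
        elif opname == "LOAD_CONST":
            stack.append(consts[oparg])
        elif opname == "BINARY_OP":
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[oparg](left, right))
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise SystemError(f"unknown opcode {opname}")


# a = 2; b = 3; x = a + b; return x
program = [
    ("LOAD_CONST", 0), ("STORE_FAST", 0),
    ("LOAD_CONST", 1), ("STORE_FAST", 1),
    ("LOAD_FAST", 0), ("LOAD_FAST", 1),
    ("BINARY_OP", 0), ("STORE_FAST", 2),
    ("LOAD_FAST", 2), ("RETURN_VALUE", 0),
]
result = run(program, consts=(2, 3), nlocals=3)
print(result)
```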
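Computed goto (optimization)
Traditional switch:
switch (opcode) {
    case LOAD_FAST: ...
    case STORE_FAST: ...
    ...
}
- Every opcode goes through one shared indirect jump
- With a single jump site, the CPU's branch predictor struggles to predict the next instruction
Computed goto (a GCC extension, also supported by Clang):
static void *jump_table[256] = {
    &&TARGET_LOAD_FAST,
    &&TARGET_STORE_FAST,
    ...
};
// Pseudocode: in CPython, DISPATCH() is a macro
DISPATCH() {
    goto *jump_table[opcode];
}
TARGET_LOAD_FAST:
    // instruction body
    DISPATCH();
TARGET_STORE_FAST:
    // instruction body
    DISPATCH();
Advantages:
- Each instruction handler ends with its own indirect jump
- The CPU predicts each of those jumps separately
- Speedup: up to 15-20% according to the comments in CPython's own source, depending on compiler and CPU architecture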
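4. Instruction categories
Stack manipulation
// NOP - no operation
inst(NOP) {
    // does nothing
}
// POP_TOP - pop the top of the stack
inst(POP_TOP) {
    PyObject *value = POP();
    Py_DECREF(value);
}
// COPY - duplicate a stack item
inst(COPY, (value -- value, value)) {
    // value stays on the stack and a second reference is pushed
    Py_INCREF(value);
}
// SWAP - exchange stack items
inst(SWAP, (top, bottom -- bottom, top)) {
    // exchange the two stack items
}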
Variable load/store
// LOAD_FAST - load a local variable
inst(LOAD_FAST, (-- value)) {
    value = GETLOCAL(oparg);
    if (value == NULL) {
        // unbound local variable
        format_exc_check_arg(UnboundLocalError, ...);
        goto error;
    }
}
// STORE_FAST - store a local variable
inst(STORE_FAST, (value --)) {
    SETLOCAL(oparg, value);
}
// LOAD_CONST - load a constant
inst(LOAD_CONST, (-- value)) {
    value = GETITEM(co->co_consts, oparg);
}
// LOAD_GLOBAL - load a global variable
family(LOAD_GLOBAL) = {
    LOAD_GLOBAL_MODULE,    // specialized: module global
    LOAD_GLOBAL_BUILTIN,   // specialized: builtin
};
inst(LOAD_GLOBAL, (-- value, null if (oparg & 1))) {
    // the low bit of oparg is a flag; the rest indexes co_names
    PyObject *name = GETITEM(co->co_names, oparg >> 1);
    // 1. look in the global namespace
    value = PyDict_GetItem(GLOBALS(), name);
    if (value == NULL) {
        // 2. look in the builtins namespace
        value = PyDict_GetItem(BUILTINS(), name);
        if (value == NULL) {
            format_exc_check_arg(NameError, ...);
            goto error;
        }
    }
}
Attribute access
// LOAD_ATTR - load an attribute
family(LOAD_ATTR) = {
    LOAD_ATTR_INSTANCE_VALUE,            // instance attribute (shared keys)
    LOAD_ATTR_MODULE,                    // module attribute
    LOAD_ATTR_WITH_HINT,                 // with a dict index hint
    LOAD_ATTR_SLOT,                      // __slots__
    LOAD_ATTR_METHOD,                    // method
    LOAD_ATTR_PROPERTY,                  // property
    LOAD_ATTR_GETATTRIBUTE_OVERRIDDEN,   // overridden __getattribute__
};
inst(LOAD_ATTR, (owner -- attr, self_or_null if (oparg & 1))) {
    PyObject *name = GETITEM(co->co_names, oparg >> 1);
    attr = PyObject_GetAttr(owner, name);
    if (attr == NULL) goto error;
    Py_DECREF(owner);
}
// Specialized variant: instance attribute (pseudocode)
inst(LOAD_ATTR_INSTANCE_VALUE, (owner -- attr, null)) {
    PyDictObject *dict = (PyDictObject *)owner->ob_dict;   // pseudo-field for the instance dict
    // index comes from the inline cache, so no hash lookup is needed
    attr = dict->ma_values->values[index];
}
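Binary operations
// BINARY_OP - binary operation
family(BINARY_OP) = {
    BINARY_OP_ADD_INT,          // int addition
    BINARY_OP_ADD_FLOAT,        // float addition
    BINARY_OP_ADD_UNICODE,      // string concatenation
    BINARY_OP_MULTIPLY_INT,     // int multiplication
    BINARY_OP_MULTIPLY_FLOAT,   // float multiplication
    // ... more specialized variants
};
inst(BINARY_OP, (lhs, rhs -- result)) {
    result = binary_ops[oparg](lhs, rhs);
    Py_DECREF(lhs);
    Py_DECREF(rhs);
    if (result == NULL) goto error;
}
// Specialized variant: int addition (sketch)
inst(BINARY_OP_ADD_INT, (left, right -- result)) {
    // guard: both operands must be exact ints, otherwise fall back
    // to the generic BINARY_OP instead of handling other types here
    DEOPT_IF(!PyLong_CheckExact(left), BINARY_OP);
    DEOPT_IF(!PyLong_CheckExact(right), BINARY_OP);
    result = _PyLong_Add((PyLongObject *)left, (PyLongObject *)right);
    if (result == NULL) goto error;
}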
Control flow
// JUMP_FORWARD - unconditional forward jump
inst(JUMP_FORWARD) {
    JUMPBY(oparg);
}
// JUMP_BACKWARD - unconditional backward jump (loops)
inst(JUMP_BACKWARD) {
    _Py_CODEUNIT *next = next_instr - oparg;
    next_instr = next;
    // backward jumps check the eval breaker
    CHECK_EVAL_BREAKER();
}
// POP_JUMP_IF_FALSE - conditional jump
inst(POP_JUMP_IF_FALSE, (cond --)) {
    int is_true = PyObject_IsTrue(cond);
    Py_DECREF(cond);
    if (is_true < 0) goto error;
    if (is_true == 0) {
        JUMPBY(oparg);
    }
}
// FOR_ITER - iterator loop
inst(FOR_ITER, (iter -- iter, next)) {
    next = (*Py_TYPE(iter)->tp_iternext)(iter);
    if (next == NULL) {
        if (PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(PyExc_StopIteration)) {
                goto error;
            }
            PyErr_Clear();
        }
        // iteration finished
        Py_DECREF(iter);
        JUMPBY(oparg);
    }
}
Function calls
// CALL - function call
family(CALL) = {
    CALL_PY_EXACT_ARGS,             // Python function, exact argument count
    CALL_PY_WITH_DEFAULTS,          // Python function, with defaults
    CALL_BOUND_METHOD_EXACT_ARGS,   // bound method
    CALL_BUILTIN_O,                 // builtin function, single argument
    CALL_BUILTIN_FAST,              // builtin function, fastcall
    CALL_METHOD_DESCRIPTOR_O,       // method descriptor
    // ... more specialized variants
};
inst(CALL, (callable, args[oparg] -- result)) {
    // 1. locate the arguments
    PyObject **args_array = stack_pointer - oparg;
    // 2. call
    result = PyObject_Vectorcall(callable, args_array, oparg, NULL);
    // 3. clean up
    Py_DECREF(callable);
    for (int i = 0; i < oparg; i++) {
        Py_DECREF(args_array[i]);
    }
    if (result == NULL) goto error;
}
// Specialized variant: Python function call (inlined)
inst(CALL_PY_EXACT_ARGS, (callable, args[oparg] -- result)) {
    PyFunctionObject *func = (PyFunctionObject *)callable;
    PyCodeObject *code = (PyCodeObject *)func->func_code;
    // 1. create the new frame
    _PyInterpreterFrame *new_frame =
        _PyFrame_PushUnchecked(tstate, func, oparg);
    // 2. copy the arguments into the new frame
    for (int i = 0; i < oparg; i++) {
        new_frame->localsplus[i] = args[i];
    }
    // 3. switch to the new frame (inlined call)
    frame = new_frame;
    next_instr = code->co_code_adaptive;
    stack_pointer = _PyFrame_Stackbase(frame);
    goto start_frame;   // no C-level recursion: keep executing in the same loop
}
Exception handling
// RAISE_VARARGS - raise an exception
inst(RAISE_VARARGS, (args[oparg] --)) {
    PyObject *exc = NULL, *cause = NULL;
    switch (oparg) {
        case 2:
            cause = args[1];   // the `from` clause
            // fall through
        case 1:
            exc = args[0];
            break;
        case 0:
            // re-raise the current exception
            exc = get_current_exception(tstate);
            break;
    }
    do_raise(tstate, exc, cause);
    goto error;
}
// PUSH_EXC_INFO - push exception info
inst(PUSH_EXC_INFO, (new_exc -- prev_exc, new_exc)) {
    prev_exc = tstate->exc_info->exc_value;
    tstate->exc_info->exc_value = new_exc;
}
// CHECK_EXC_MATCH - test an exception against an except clause
inst(CHECK_EXC_MATCH, (left, right -- left, result)) {
    // does the exception `left` match the class (or tuple of classes) `right`?
    result = PyErr_GivenExceptionMatches(left, right) ? Py_True : Py_False;
}
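5. Performance optimization techniques
5.1 Inline caching
Idea: store extra information right after an instruction to speed up repeated execution.
LOAD_ATTR example (simplified):
typedef struct {
    _Py_CODEUNIT counter;        // execution counter
    _Py_CODEUNIT type_version;   // type version
    _Py_CODEUNIT index;          // dict index
    _Py_CODEUNIT unused;         // padding
} _PyLoadAttrCache;
inst(LOAD_ATTR_INSTANCE_VALUE, (owner -- attr, null)) {
    // fetch the cache that follows the instruction
    _PyLoadAttrCache *cache = (_PyLoadAttrCache *)next_instr;
    // 1. check the type version
    PyTypeObject *tp = Py_TYPE(owner);
    if (tp->tp_version_tag != cache->type_version) {
        // cache is stale: fall back to the generic instruction
        goto deopt;
    }
    // 2. use the cached index for direct access
    PyDictObject *dict = get_instance_dict(owner);   // hypothetical helper
    PyObject **values = dict->ma_values->values;
    attr = values[cache->index];
    // 3. skip the cache entries
    next_instr += sizeof(_PyLoadAttrCache) / sizeof(_Py_CODEUNIT);
}
Hit rates (rough, workload-dependent figures):
- LOAD_ATTR: ~95%
- LOAD_GLOBAL: ~98%
- CALL: ~90%
Effect (illustrative orders of magnitude, not measurements):
- First execution: ~200ns
- Cache hit: ~20ns
- Speedup: about 10x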
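The guard-then-fast-path pattern can be modelled in a few lines of Python. This is only an analogy under stated assumptions: `AttrSite` is a made-up class, and object identity of the type stands in for the `tp_version_tag` comparison that the real interpreter performs.

```python
class AttrSite:
    """One attribute-load site with a single-entry cache guarded by the owner's type."""

    def __init__(self, name):
        self.name = name
        self.cached_type = None   # stands in for the cached type version
        self.hits = 0
        self.misses = 0

    def load(self, owner):
        if type(owner) is self.cached_type:
            try:
                value = owner.__dict__[self.name]   # fast path: direct dict access
            except KeyError:
                pass                                # guard passed but no value: deopt
            else:
                self.hits += 1
                return value
        self.misses += 1
        value = getattr(owner, self.name)           # generic lookup
        if self.name in getattr(owner, "__dict__", {}):
            self.cached_type = type(owner)          # specialize for this type
        return value


class Point:
    def __init__(self, x):
        self.x = x


site = AttrSite("x")
values = [site.load(p) for p in (Point(1), Point(2), Point(3))]
print(values, site.hits, site.misses)
```

The first load misses and fills the cache; later loads on objects of the same type take the fast path.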
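5.2 Instruction specialization (quickening)
Flow:
flowchart LR
A[Generic instruction] --> B[Count executions]
B --> C{Threshold reached?}
C -->|no| A
C -->|yes| D[Inspect runtime types]
D --> E{Specializable?}
E -->|yes| F[Replace with specialized instruction]
E -->|no| G[Mark as not specializable]
F --> H[Fast execution]
G --> A
Example: specializing BINARY_OP
# Source code
x = a + b
# Bytecode (initial)
BINARY_OP 0 (ADD)
# After a few executions, if a and b are always int,
# the instruction is replaced with:
BINARY_OP_ADD_INT
# If they are float
BINARY_OP_ADD_FLOAT
# If they are str
BINARY_OP_ADD_UNICODE
Cost of the specialized variants (illustrative figures, not measurements):
Generic: 100ns
Specialized int: 20ns
Specialized float: 25ns
Specialized str: 30ns
Adaptive instruction:
inst(BINARY_OP) {
    // check the counter
    _PyBinaryOpCache *cache = (_PyBinaryOpCache *)next_instr;
    if (cache->counter-- == 0) {
        // try to specialize
        _Py_Specialize_BinaryOp(left, right, oparg, next_instr);
    }
    // run the generic version
    result = PyNumber_BinaryOp(left, oparg, right);
}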
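Specialization can be observed with the `dis` module, which accepts `adaptive=True` on CPython 3.11 and later to show the quickened bytecode. The sketch below is hedged accordingly: `add` and `binary_opnames` are made-up names, the warm-up count of 1000 is an arbitrary choice, and whether (and to which variant) the instruction gets specialized depends on the interpreter version and build.

```python
import dis
import sys


def add(a, b):
    return a + b


def binary_opnames(func):
    """Return the names of the binary-operation instructions in func."""
    # `adaptive` only exists on 3.11+; older versions show the plain bytecode.
    kwargs = {"adaptive": True} if sys.version_info >= (3, 11) else {}
    return [ins.opname
            for ins in dis.get_instructions(func, **kwargs)
            if ins.opname.startswith("BINARY_")]


before = binary_opnames(add)
for _ in range(1000):   # warm the function up with int operands
    add(1, 2)
after = binary_opnames(add)
print("before warm-up:", before)
print("after warm-up: ", after)
```

On a build with the specializing interpreter, the second line typically shows a specialized name in place of the generic `BINARY_OP`.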
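5.3 Tier 2 optimizer (experimental)
Architecture:
Tier 1: bytecode interpreter (default)
↓ detects hot code
Tier 2: micro-op (μop) optimizer
↓ translates
Trace: optimized micro-op sequence
↓ optional
JIT: machine code (experimental)
Trace generation:
# Python code
def fib(n):
    if n <= 1:
        return 1
    return fib(n-1) + fib(n-2)
# Tier 1 bytecode
LOAD_FAST n
LOAD_CONST 1
COMPARE_OP <=
POP_JUMP_IF_FALSE label
LOAD_CONST 1
RETURN_VALUE
label:
...
# Tier 2 trace (assuming n > 1; micro-op names are illustrative)
_LOAD_FAST n
_LOAD_CONST 1
_COMPARE_OP <=
_GUARD_NOT_TRUE // assume the condition is False
_LOAD_GLOBAL fib
_LOAD_FAST n
_LOAD_CONST 1
_BINARY_OP_SUBTRACT_INT
_CALL_PY_EXACT_ARGS
...
Advantages:
- Branches on the hot path become guards, which avoids branch mispredictions
- Simple functions can be inlined
- Optimizations can span several instructions
- Speedups are workload-dependent; the 10-30% range quoted here is a rough target, not a measured result
Enabling (CPython 3.13+, experimental; there is no sys-level switch for this):
# Build time: include the experimental JIT
./configure --enable-experimental-jit
# Run time: toggle it with an environment variable
PYTHON_JIT=1 python script.py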
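6. Exception handling mechanism
Exception table format
// Conceptual view of one exception table entry
// (the real table in co_exceptiontable is a compact byte encoding)
typedef struct {
    int start;    // start offset
    int end;      // end offset
    int target;   // handler offset
    int depth;    // stack depth
    bool lasti;   // whether to save lasti
} ExceptionTableEntry;
Python example (offsets are illustrative):
def foo():
    try:                        # protected range starts: start=0
        x = 1/0                 # protected range ends after this statement: end=5
    except ZeroDivisionError:   # target=10
        print("error")
Exception table:
| start | end | target | depth | lasti |
|-------|-----|--------|-------|-------|
| 0 | 5 | 10 | 0 | yes |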
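The lookup against such a table amounts to a range search on the offset of the raising instruction. The sketch below models it with a named tuple; `Entry` and `find_handler` are hypothetical names, and treating `end` as exclusive is an assumption made for this example.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    start: int     # first covered offset (inclusive)
    end: int       # end of the covered range (assumed exclusive)
    target: int    # offset of the handler
    depth: int     # stack depth to restore before entering the handler
    lasti: bool    # whether to push the offset of the raising instruction


def find_handler(table, offset) -> Optional[Entry]:
    """Return the first entry whose range covers offset, or None."""
    for entry in table:
        if entry.start <= offset < entry.end:
            return entry
    return None


table = [Entry(start=0, end=5, target=10, depth=0, lasti=True)]
print(find_handler(table, 3))   # covered: handler at offset 10
print(find_handler(table, 7))   # not covered: the exception propagates
```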
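Exception handling flow
// An exception has been raised
error:
    // 1. fetch the current exception
    PyObject *exc = PyErr_GetRaisedException();
    // 2. look up the exception table
    int offset = (next_instr - co->co_code_adaptive);
    ExceptionTableEntry *entry = find_exception_handler(
        co->co_exceptiontable, offset);
    if (entry != NULL) {
        // 3. handler found
        // 3.1 restore the stack depth
        while (stack_pointer > stack_base + entry->depth) {
            Py_DECREF(POP());
        }
        // 3.2 push the exception
        PUSH(exc);
        // 3.3 jump to the handler
        next_instr = co->co_code_adaptive + entry->target;
        goto start_frame;
    } else {
        // 4. no handler: propagate to the caller's frame
        PyErr_SetRaisedException(exc);   // put the exception back
        frame = frame->previous;
        if (frame == NULL) {
            // top level: return NULL so the C caller can report the exception
            return NULL;
        }
        // keep searching in the caller's frame
        goto error;
    }
finally clauses:
try:
    x = 1/0
finally:
    cleanup()
- The body of a finally block is emitted on two paths:
  - The normal path (after end)
  - The exception path (the exception target), which re-raises afterwards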
7. Periodic checks (eval breaker)
Trigger conditions
// eval_breaker is a bit mask
// (illustrative names; CPython's own macros, for example
// _PY_GIL_DROP_REQUEST_BIT, differ in spelling and bit order)
#define _PY_SIGNAL_PENDING    (1 << 0)  // signal pending
#define _PY_GIL_DROP_REQUEST  (1 << 1)  // request to drop the GIL
#define _PY_ASYNC_EXCEPTION   (1 << 2)  // asynchronous exception
#define _PY_PENDING_CALL      (1 << 3)  // pending call
#define _PY_GC_PENDING        (1 << 4)  // GC request
#define _PY_EVAL_TRACING      (1 << 5)  // tracing mode
When the check runs
// Backward jumps (loops)
inst(JUMP_BACKWARD) {
    JUMPBY(-oparg);
    CHECK_EVAL_BREAKER();   // checked on every loop iteration
}
// After a call
inst(CALL) {
    result = PyObject_Call(callable, args, kwargs);
    CHECK_EVAL_BREAKER();   // checked after the call
}
// RESUME instruction (function entry)
inst(RESUME) {
    CHECK_EVAL_BREAKER();
}
Handling logic
int check_periodics(PyThreadState *tstate)
{
    uintptr_t eval_breaker = tstate->eval_breaker;
    // 1. signal handling (main thread only)
    if (eval_breaker & _PY_SIGNAL_PENDING) {
        if (PyErr_CheckSignals() < 0) {
            return -1;
        }
    }
    // 2. GIL drop request
    if (eval_breaker & _PY_GIL_DROP_REQUEST) {
        drop_gil(tstate);
        // let other threads run
        take_gil(tstate);
    }
    // 3. asynchronous exception
    if (eval_breaker & _PY_ASYNC_EXCEPTION) {
        PyObject *exc = tstate->async_exc;
        tstate->async_exc = NULL;
        PyErr_SetRaisedException(exc);
        return -1;
    }
    // 4. pending calls
    if (eval_breaker & _PY_PENDING_CALL) {
        if (make_pending_calls(tstate) < 0) {
            return -1;
        }
    }
    // 5. GC
    if (eval_breaker & _PY_GC_PENDING) {
        _PyGC_Collect(tstate);
    }
    return 0;
}
Frequency:
- Backward jumps: every loop iteration
- Calls: after every call
- RESUME: at function entry
- There is no fixed instruction count between checks; the "every 100 instructions" check interval belongs to Python 2
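The bit-mask protocol can be sketched in Python as follows. Everything here is a toy model: the flag values, `ToyThreadState`, and this `check_periodics` are invented for illustration and do not mirror CPython's real bit assignments.

```python
SIGNAL_PENDING = 1 << 0
GIL_DROP_REQUEST = 1 << 1
PENDING_CALL = 1 << 3


class ToyThreadState:
    def __init__(self):
        self.eval_breaker = 0   # zero means "nothing to do": the hot path is one test
        self.log = []


def check_periodics(ts):
    """Service every pending request, clearing each bit as it is handled."""
    if ts.eval_breaker & SIGNAL_PENDING:
        ts.eval_breaker &= ~SIGNAL_PENDING
        ts.log.append("signals")
    if ts.eval_breaker & GIL_DROP_REQUEST:
        ts.eval_breaker &= ~GIL_DROP_REQUEST
        ts.log.append("gil")
    if ts.eval_breaker & PENDING_CALL:
        ts.eval_breaker &= ~PENDING_CALL
        ts.log.append("pending-calls")
    return 0


ts = ToyThreadState()
ts.eval_breaker |= PENDING_CALL | SIGNAL_PENDING   # two requests arrive
if ts.eval_breaker:                                # the check done at a backward jump
    check_periodics(ts)
print(ts.log, ts.eval_breaker)
```

Keeping all requests in one word is what makes the common case cheap: the loop tests a single value and only enters the slow path when some bit is set.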
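8. Critical Path Analysis
Execution path of a simple addition
x = a + b
Bytecode:
LOAD_FAST 0 (a)
LOAD_FAST 1 (b)
BINARY_OP 0 (ADD)
STORE_FAST 2 (x)
Execution flow (rough, illustrative timings in microseconds):
- LOAD_FAST 0: 0.02μs
  - Read localsplus[0]
  - Increment the reference count
  - Push onto the stack
- LOAD_FAST 1: 0.02μs
  - Read localsplus[1]
  - Increment the reference count
  - Push onto the stack
- BINARY_OP 0: 0.05-0.5μs (depends on the operand types)
  - Pop the two operands
  - Type check
  - Call tp_as_number->nb_add
  - Small int: fast path, 0.05μs
  - Large int: 0.2μs
  - Float: 0.1μs
  - String: 0.5μs
  - Decrement the reference counts
  - Push the result
- STORE_FAST 2: 0.02μs
  - Pop the top of the stack
  - Store into localsplus[2]
  - If a previous value existed, decrement its reference count
Total:
- Fastest (small ints): 0.11μs
- Typical (ordinary objects): 0.3μs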
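Figures like these are best checked on the machine at hand. A minimal sketch with `timeit` follows; `measure` is a made-up helper, the loop counts are arbitrary, and the printed numbers will differ from the estimates above depending on hardware and CPython version.

```python
import timeit


def measure(stmt, setup="a = 1; b = 2", repeat=5, number=100_000):
    """Return the best observed time per execution of stmt, in microseconds."""
    best = min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))
    return best / number * 1e6


cases = [
    ("int", "a = 1; b = 2"),
    ("float", "a = 1.5; b = 2.5"),
    ("str", "a = 'x'; b = 'y'"),
]
for label, setup in cases:
    print(f"{label:5s} {measure('x = a + b', setup):.4f} us per 'x = a + b'")
```

Because `timeit` compiles the statement into a function body, `a`, `b`, and `x` are fast locals, which matches the `LOAD_FAST`/`STORE_FAST` sequence analysed above.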
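Execution path of a function call
def add(a, b):
    return a + b
result = add(1, 2)
Bytecode (caller):
LOAD_GLOBAL 0 (add)
LOAD_CONST 1 (1)
LOAD_CONST 2 (2)
CALL 2
STORE_FAST 0 (result)
Execution flow (rough, illustrative timings):
- LOAD_GLOBAL add: 0.1μs (cache hit)
- LOAD_CONST × 2: 0.04μs
- CALL 2:
  - Generic path: 1.0μs
    - Pack the arguments
    - Create a frame
    - Switch context
  - Specialized path (CALL_PY_EXACT_ARGS): 0.3μs
    - Push the frame inline
    - Copy the arguments
    - Switch the instruction pointer
- Function body: 0.11μs (the addition analysed above)
- RETURN_VALUE: 0.1μs
  - Pop the return value
  - Restore the caller's frame
- STORE_FAST result: 0.02μs
Total:
- Generic path: 1.37μs
- Specialized path: 0.67μs
- Improvement: about 2x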
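9. Boundaries and Limits
Recursion depth
# Default limit
import sys
print(sys.getrecursionlimit())  # 1000
# Change the limit
sys.setrecursionlimit(2000)
Implementation:
int _Py_EnterRecursiveCallTstate(PyThreadState *tstate, const char *where)
{
    if (--tstate->py_recursion_remaining <= 0) {
        tstate->py_recursion_remaining += 50;   // headroom for handling the exception
        _PyErr_Format(tstate, PyExc_RecursionError,
                      "maximum recursion depth exceeded%s", where);
        tstate->py_recursion_remaining -= 50;
        return -1;
    }
    return 0;
}
Notes:
- The C stack is limited too (commonly 8 MB on Linux)
- The Python-level limit exists to keep the C stack from overflowing
- The usable depth is lower than the configured value, because frames already on the stack count against it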
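The gap between the configured limit and the usable depth is easy to observe. The sketch below recurses until `RecursionError` and counts the frames it managed to enter; `max_depth` and `dive` are names made up for this example.

```python
import sys


def max_depth():
    """Recurse until RecursionError and report how many frames were entered."""
    depth = 0

    def dive():
        nonlocal depth
        depth += 1
        dive()

    try:
        dive()
    except RecursionError:
        pass
    return depth


limit = sys.getrecursionlimit()
reached = max_depth()
print(f"limit={limit}, frames entered={reached}")
```

The count stays below the limit because the frames of the calling code are already on the stack when the recursion starts.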
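Stack size
def deep_expr(a):
    # right-nested operands pile up on the stack before any addition runs;
    # redundant parentheses such as (((((1))))) do not deepen the stack
    return a + (a + (a + (a + a)))
Computed at compile time (pseudocode):
// Python/compile.c
int stackdepth(struct compiler *c)
{
    // walk the CFG and compute the maximum stack depth
    for (basicblock *b : cfg) {
        int depth = simulate_block(b);
        maxdepth = max(maxdepth, depth);
    }
    return maxdepth;
}
Limits:
- In theory: bounded by memory
- In practice: usually < 1000
- Overly deep expressions can exhaust the compiler's own recursion budget at compile time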
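The compile-time result is exposed as `co_stacksize`, so the effect of expression shape can be compared directly. In this sketch, `flat` and `nested` are two made-up functions that perform the same additions in a different grouping; the absolute values may vary by CPython version, but the nested form needs the deeper stack.

```python
def flat(a):
    # left-nested: each partial sum is consumed before the next operand is pushed
    return (((a + a) + a) + a) + a


def nested(a):
    # right-nested: all five operands are pushed before the first addition runs
    return a + (a + (a + (a + a)))


flat_depth = flat.__code__.co_stacksize
nested_depth = nested.__code__.co_stacksize
print(f"flat: {flat_depth}, nested: {nested_depth}")
```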
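Performance limits
Instruction throughput (order-of-magnitude estimates):
- Simple instructions: 10-50M ops/s
- Complex instructions: 1-10M ops/s
- Average: 5-10M ops/s
Bottlenecks (rough shares of run time, workload-dependent):
- Reference counting overhead: 20-30%
- Instruction dispatch: 10-20%
- Object operations: 30-40%
- GIL: CPU-bound threads gain nothing from extra cores
10. Debugging Tips
Viewing bytecode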
import dis
def foo(x):
return x + 1
dis.dis(foo)
Output (abridged; exact offsets and opcodes vary by CPython version):
2 0 LOAD_FAST 0 (x)
2 LOAD_CONST 1 (1)
4 BINARY_OP 0 (+)
8 RETURN_VALUE
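When the listing needs to be processed rather than read, `dis.get_instructions` returns the same information as objects. The following sketch collects the opcode names of `foo`; `opnames` is a helper invented for this example, and the exact names printed depend on the CPython version.

```python
import dis


def foo(x):
    return x + 1


def opnames(func):
    """List the opcode names of func in bytecode order."""
    return [ins.opname for ins in dis.get_instructions(func)]


names = opnames(foo)
print(names)
```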
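Tracing execution
# Build with debugging enabled
./configure --with-pydebug
make
# Note: python -d only turns on parser debugging output.
# For a per-instruction trace, recent debug builds honor a
# module-level __lltrace__ variable in the script being run.
python script.py
Performance statistics
# Requires a build configured with --enable-pystats
import sys
# Start collecting statistics
sys._stats_on()
# Run the code
...
# Stop collecting and dump the statistics
sys._stats_off()
sys._stats_dump()
This document gives an overview of the CPython interpreter core. Understanding bytecode execution, frame management, and the performance optimizations described here makes it much easier to reason about how Python programs actually run.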
10. Interpreter Core API: Source Walkthrough
10.1 The bytecode execution core: _PyEval_EvalFrameDefault
// Python/ceval.c (simplified, older-style stack macros)
PyObject* _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int throwflag)
{
    _Py_CODEUNIT *next_instr;
    PyObject **stack_pointer;
    // initialization
    next_instr = frame->prev_instr + 1;
    stack_pointer = _PyFrame_GetStackPointer(frame);
    // main interpreter loop
    for (;;) {
        _Py_CODEUNIT word = *next_instr++;
        int opcode = _Py_OPCODE(word);
        int oparg = _Py_OPARG(word);
        // instruction dispatch
        switch (opcode) {
            case LOAD_FAST: {
                PyObject *value = GETLOCAL(oparg);
                if (value == NULL) {
                    format_exc_check_arg(tstate, PyExc_UnboundLocalError, ...);
                    goto error;
                }
                Py_INCREF(value);
                PUSH(value);
                DISPATCH();
            }
            case STORE_FAST: {
                PyObject *value = POP();
                SETLOCAL(oparg, value);
                DISPATCH();
            }
            case BINARY_OP: {
                PyObject *rhs = POP();
                PyObject *lhs = TOP();
                PyObject *res = _PyEval_BinaryOps[oparg](lhs, rhs);
                SET_TOP(res);
                Py_DECREF(lhs);
                Py_DECREF(rhs);
                if (res == NULL) goto error;
                DISPATCH();
            }
            case CALL: {
                // stack: ..., callable, arg0, ..., argN-1
                PyObject **args = stack_pointer - oparg;
                PyObject *callable = args[-1];
                PyObject *res = _PyObject_Vectorcall(callable, args, oparg, NULL);
                if (res == NULL) goto error;
                // drop callable and arguments (reference cleanup omitted)
                stack_pointer = args - 1;
                PUSH(res);
                DISPATCH();
            }
            // ... more instructions
        }
    }
}
UML sequence diagram of the interpreter's execution flow:
sequenceDiagram
autonumber
participant Call as Function call
participant Eval as _PyEval_EvalFrameDefault
participant Frame as Frame
participant Stack as Value stack
participant Instr as Instruction
Call->>Eval: execute code object
Eval->>Frame: create/initialize frame
Frame->>Stack: initialize value stack
loop Interpreter loop
Eval->>Instr: fetch next instruction
Instr-->>Eval: opcode + oparg
alt LOAD instruction
Eval->>Frame: load from local variable
Frame-->>Stack: push value
else STORE instruction
Stack-->>Eval: pop value
Eval->>Frame: store into local variable
else BINARY_OP
Stack-->>Eval: pop 2 operands
Eval->>Eval: perform operation
Eval-->>Stack: push result
else CALL
Stack-->>Eval: pop arguments and callable
Eval->>Call: recursive call
Call-->>Stack: push return value
end
Eval->>Eval: DISPATCH()
end
Eval-->>Call: return result
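In-depth supplement: core API source walkthrough
10.1 _PyEval_EvalFrameDefault - the main interpreter loop
Signature and responsibilities
// Python/ceval.c (starting around line 1023)
PyObject* _PyEval_EvalFrameDefault(
    PyThreadState *tstate,        // thread state
    _PyInterpreterFrame *frame,   // frame to execute
    int throwflag                 // nonzero when an exception is being thrown into the frame
)
Responsibilities:
- Execute the bytecode instruction sequence
- Manage the value stack and frames
- Handle exceptions and control flow
- Support the performance optimizations (inline caches, specialization, JIT)
Execution flow (simplified reconstruction, not verbatim CPython source)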
PyObject* _PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int throwflag)
{
    // 1. check the recursion depth
    if (_Py_EnterRecursiveCallTstate(tstate, "")) {
        _PyEval_FrameClearAndPop(tstate, frame);
        return NULL;
    }
    // 2. local state (kept in locals so the compiler can use registers)
    _Py_CODEUNIT *next_instr;     // next instruction pointer
    _PyStackRef *stack_pointer;   // value stack pointer
    // 3. create the entry frame (used by exception handling)
    _PyEntryFrame entry;
    entry.frame.previous = tstate->current_frame;
    frame->previous = &entry.frame;
    tstate->current_frame = frame;
    // 4. check for a Tier 2 executor (JIT)
#ifdef _Py_TIER2
    if (tstate->current_executor != NULL) {
        // run the JIT-compiled trace (illustrative helper name)
        PyObject *res = _PyJIT_ExecuteTrace(tstate, frame);
        if (res != NULL) {
            _Py_LeaveRecursiveCallTstate(tstate);
            return res;
        }
    }
#endif
    // 5. initialize the instruction pointer and the stack pointer
    PyCodeObject *co = _PyFrame_GetCode(frame);
    next_instr = _PyCode_CODE(co);
    stack_pointer = _PyFrame_Stackbase(frame);
    // 6. main interpreter loop
#if USE_COMPUTED_GOTOS
    // computed goto (GCC/Clang)
    void **opcode_targets = opcode_targets_table;
#define DISPATCH() goto *opcode_targets[opcode]
#else
    // plain switch-case
#define DISPATCH() continue
#endif
    for (;;) {
        // 6.1 fetch the current instruction
        _Py_CODEUNIT word = *next_instr++;
        uint8_t opcode = _Py_OPCODE(word);
        int oparg = _Py_OPARG(word);
        // 6.2 periodic checks (signals, GIL drop requests, thread switches)
        if (_Py_atomic_load_relaxed_int32(&tstate->interp->ceval.eval_breaker)) {
            if (eval_frame_handle_pending(tstate) != 0) {
                goto error;
            }
        }
        // 6.3 instruction dispatch
        switch (opcode) {
        // === variable load instructions ===
        case LOAD_FAST: {
            _PyStackRef value = GETLOCAL(oparg);
            if (PyStackRef_IsNull(value)) {
                _PyEval_FormatExcCheckArg(tstate, PyExc_UnboundLocalError,
                                          UNBOUNDLOCAL_ERROR_MSG,
                                          PyTuple_GetItem(co->co_localsplusnames, oparg));
                goto error;
            }
            stack_pointer[0] = PyStackRef_Dup(value);
            stack_pointer++;
            DISPATCH();
        }
        case LOAD_CONST: {
            PyObject *value = PyTuple_GET_ITEM(co->co_consts, oparg);
            stack_pointer[0] = PyStackRef_FromPyObjectNew(value);
            stack_pointer++;
            DISPATCH();
        }
        case LOAD_GLOBAL: {
            // uses an inline cache
            PyObject *name = PyTuple_GET_ITEM(co->co_names, oparg);
            PyObject *v = NULL;
            // try the inline cache first
            _Py_CODEUNIT *cache = next_instr;
            uint32_t version = cache[0].cache;
            if (version == GLOBALS()->ma_version_tag) {
                // cache hit
                v = (PyObject *)cache[1].cache;
                Py_INCREF(v);
            }
            else {
                // cache miss: look the name up in the dicts
                v = PyDict_GetItemWithError(GLOBALS(), name);
                if (v == NULL) {
                    v = PyDict_GetItemWithError(BUILTINS(), name);
                    if (v == NULL) {
                        _PyEval_FormatExcCheckArg(tstate, PyExc_NameError,
                                                  NAME_ERROR_MSG, name);
                        goto error;
                    }
                }
                Py_INCREF(v);
                // refresh the cache
                cache[0].cache = GLOBALS()->ma_version_tag;
                cache[1].cache = (uintptr_t)v;
            }
            stack_pointer[0] = PyStackRef_FromPyObjectSteal(v);
            stack_pointer++;
            next_instr += 2;   // skip the cache entries
            DISPATCH();
        }
        // === variable store instructions ===
        case STORE_FAST: {
            _PyStackRef v = stack_pointer[-1];
            SETLOCAL(oparg, v);
            stack_pointer--;
            DISPATCH();
        }
        case STORE_GLOBAL: {
            PyObject *name = PyTuple_GET_ITEM(co->co_names, oparg);
            _PyStackRef v = stack_pointer[-1];
            int err = PyDict_SetItem(GLOBALS(), name, PyStackRef_AsPyObjectBorrow(v));
            PyStackRef_CLOSE(v);
            stack_pointer--;
            if (err != 0) goto error;
            DISPATCH();
        }
        // === binary operation instructions ===
        case BINARY_OP: {
            _PyStackRef rhs = stack_pointer[-1];
            _PyStackRef lhs = stack_pointer[-2];
            PyObject *res;
            // pick the operation from oparg
            switch (oparg) {
                case NB_ADD:
                    res = PyNumber_Add(
                        PyStackRef_AsPyObjectBorrow(lhs),
                        PyStackRef_AsPyObjectBorrow(rhs)
                    );
                    break;
                case NB_SUBTRACT:
                    res = PyNumber_Subtract(
                        PyStackRef_AsPyObjectBorrow(lhs),
                        PyStackRef_AsPyObjectBorrow(rhs)
                    );
                    break;
                case NB_MULTIPLY:
                    res = PyNumber_Multiply(
                        PyStackRef_AsPyObjectBorrow(lhs),
                        PyStackRef_AsPyObjectBorrow(rhs)
                    );
                    break;
                // ... other operations
            }
            PyStackRef_CLOSE(lhs);
            PyStackRef_CLOSE(rhs);
            if (res == NULL) goto error;
            stack_pointer[-2] = PyStackRef_FromPyObjectSteal(res);
            stack_pointer--;
            DISPATCH();
        }
        // === call instructions ===
        case CALL: {
            // stack: ..., callable, arg0, ..., argN-1
            int total_args = oparg;
            _PyStackRef *args = stack_pointer - total_args;
            _PyStackRef callable_ref = args[-1];
            PyObject *callable = PyStackRef_AsPyObjectBorrow(callable_ref);
            // fast path: Python function
            if (PyFunction_Check(callable)) {
                PyFunctionObject *func = (PyFunctionObject *)callable;
                PyCodeObject *func_code = (PyCodeObject *)func->func_code;
                // create the new frame
                _PyInterpreterFrame *new_frame = _PyEvalFramePushAndInit(
                    tstate, func, NULL, args, total_args, NULL
                );
                if (new_frame == NULL) {
                    goto error;
                }
                // save the current state
                _PyFrame_SetStackPointer(frame, stack_pointer);
                frame->instr_ptr = next_instr;
                // switch to the new frame
                frame = new_frame;
                tstate->current_frame = frame;
                // reload the cached locals
                co = func_code;
                next_instr = _PyCode_CODE(co);
                stack_pointer = _PyFrame_Stackbase(frame);
                goto start_frame;
            }
            // slow path: generic call
            PyObject *res = _PyObject_Vectorcall(
                callable,
                (PyObject **)args,
                total_args,
                NULL
            );
            // release the callable (args[-1]) and the arguments
            for (int i = 0; i <= total_args; i++) {
                PyStackRef_CLOSE(args[i - 1]);
            }
            if (res == NULL) goto error;
            stack_pointer = args - 1;
            stack_pointer[0] = PyStackRef_FromPyObjectSteal(res);
            stack_pointer++;
            DISPATCH();
        }
        // === control-flow instructions ===
        case RETURN_VALUE: {
            _PyStackRef retval = stack_pointer[-1];
            stack_pointer--;
            // restore the previous frame
            _PyInterpreterFrame *prev = frame->previous;
            _PyEvalFrameClearAndPop(tstate, frame);
            if (prev == &entry.frame) {
                // return to the C caller
                _Py_LeaveRecursiveCallTstate(tstate);
                return PyStackRef_AsPyObjectSteal(retval);
            }
            // return to the Python caller
            frame = prev;
            tstate->current_frame = frame;
            // reload the cached locals
            co = _PyFrame_GetCode(frame);
            next_instr = frame->instr_ptr;
            stack_pointer = _PyFrame_GetStackPointer(frame);
            // push the return value
            stack_pointer[0] = retval;
            stack_pointer++;
            DISPATCH();
        }
        case POP_JUMP_IF_FALSE: {
            _PyStackRef cond = stack_pointer[-1];
            int err = PyObject_IsTrue(PyStackRef_AsPyObjectBorrow(cond));
            PyStackRef_CLOSE(cond);
            stack_pointer--;
            if (err > 0) {
                // condition is true: fall through
            }
            else if (err == 0) {
                // condition is false: jump forward by oparg code units
                next_instr += oparg;
            }
            else {
                // an error occurred
                goto error;
            }
            DISPATCH();
        }
        // === exception instructions ===
        case RAISE_VARARGS: {
            PyObject *cause = NULL, *exc = NULL;
            switch (oparg) {
                case 2:
                    cause = PyStackRef_AsPyObjectSteal(stack_pointer[-1]);
                    // fallthrough
                case 1:
                    // exc sits below cause when oparg == 2, on top otherwise
                    exc = PyStackRef_AsPyObjectSteal(stack_pointer[-oparg]);
                    break;
                case 0:
                    // bare `raise`: re-raise the exception being handled
                    break;
                default:
                    PyErr_SetString(PyExc_SystemError, "bad RAISE_VARARGS oparg");
                    goto error;
            }
            stack_pointer -= oparg;
            if (do_raise(tstate, exc, cause)) {
                // do_raise reports a re-raise
                goto exception_unwind;
            }
            goto error;
        }
        default:
            // unknown instruction
            PyErr_Format(PyExc_SystemError,
                         "unknown opcode %d", opcode);
            goto error;
        }
start_frame:
        // a new frame starts executing
        if (_Py_EnterRecursivePy(tstate) < 0) {
            goto exit_unwind;
        }
        DISPATCH();
error:
        // error handling
exception_unwind:
        {
            // look for an exception handler
            while (frame != &entry.frame) {
                if (_PyFrame_IsEntryFrame(frame)) {
                    break;
                }
                PyObject *exc = _PyErr_GetRaisedException(tstate);
                _Py_CODEUNIT *handler = _PyFrame_GetExceptHandler(frame);
                if (handler != NULL) {
                    // handler found
                    next_instr = handler;
                    stack_pointer = _PyFrame_Stackbase(frame);
                    // push the exception object
                    stack_pointer[0] = PyStackRef_FromPyObjectSteal(exc);
                    stack_pointer++;
                    DISPATCH();
                }
                // no handler in this frame: go back to the previous one
                _PyInterpreterFrame *prev = frame->previous;
                _PyEvalFrameClearAndPop(tstate, frame);
                frame = prev;
                tstate->current_frame = frame;
                _PyErr_SetRaisedException(tstate, exc);
            }
            // no handler found: propagate the exception
            goto exit_unwind;
        }
exit_unwind:
        // clean up and exit
        _Py_LeaveRecursiveCallTstate(tstate);
        return NULL;
    }
}
Key data structures in detail
The _PyInterpreterFrame structure
// Include/internal/pycore_interpframe_structs.h
struct _PyInterpreterFrame {
    // executable object (code object or None)
    _PyStackRef f_executable;
    // pointer to the previous frame
    struct _PyInterpreterFrame *previous;
    // function object (deferred or strong reference)
    _PyStackRef f_funcobj;
    // globals dict (borrowed reference)
    PyObject *f_globals;
    // builtins dict (borrowed reference)
    PyObject *f_builtins;
    // locals dict (strong reference, may be NULL)
    PyObject *f_locals;
    // frame object (used for debugging)
    PyFrameObject *frame_obj;
    // pointer to the instruction being executed
    _Py_CODEUNIT *instr_ptr;
    // value stack pointer
    _PyStackRef *stackpointer;
#ifdef Py_GIL_DISABLED
    // thread-local bytecode index
    int32_t tlbc_index;
#endif
    // return offset
    uint16_t return_offset;
    // frame owner tag
    char owner;
    // visited flag
    uint8_t visited;
    // locals and value stack (variable-length array)
    _PyStackRef localsplus[1];
};
10.2 Optimizing instruction dispatch
Computed goto vs switch-case
// Python/ceval.c
#if USE_COMPUTED_GOTOS
// computed goto, supported by GCC/Clang
#include "opcode_targets.h"
// jump table
static void *opcode_targets[256] = {
    &&TARGET_LOAD_FAST,
    &&TARGET_LOAD_CONST,
    &&TARGET_STORE_FAST,
    // ...
};
#define DISPATCH() goto *opcode_targets[opcode]
// main loop
dispatch_opcode:
    opcode = *next_instr++;
    goto *opcode_targets[opcode];
TARGET_LOAD_FAST:
    // handle LOAD_FAST
    DISPATCH();
TARGET_LOAD_CONST:
    // handle LOAD_CONST
    DISPATCH();
#else
// plain switch-case
#define DISPATCH() goto dispatch_opcode
dispatch_opcode:
    opcode = *next_instr++;
    switch (opcode) {
        case LOAD_FAST:
            // handle
            DISPATCH();
        case LOAD_CONST:
            // handle
            DISPATCH();
    }
#endif
Performance comparison:
- Computed goto: up to 15-20% faster according to the comments in CPython's source, depending on compiler and CPU
- Why: fewer branch mispredictions, since every handler has its own indirect jump
10.3 Inline cache mechanism
Global variable cache (simplified sketch)
// Python/ceval.c - LOAD_GLOBAL optimization
case LOAD_GLOBAL: {
    // cache layout:
    // [version_tag] [cached_value]
    _Py_CODEUNIT *cache = next_instr;
    uint32_t version = cache[0].cache;
    PyObject *name = PyTuple_GET_ITEM(co->co_names, oparg);
    PyDictObject *globals = (PyDictObject *)frame->f_globals;
    // check the version
    if (version == globals->ma_version_tag) {
        // cache hit (FromPyObjectNew takes its own reference,
        // so no separate Py_INCREF is needed)
        PyObject *value = (PyObject *)cache[1].cache;
        PUSH(PyStackRef_FromPyObjectNew(value));
        next_instr += 2;   // skip the cache
        DISPATCH();
    }
    // cache miss: regular lookup
    PyObject *value = _PyDict_LoadGlobal(globals, BUILTINS(), name);
    if (value == NULL) {
        goto error;
    }
    // refresh the cache
    cache[0].cache = globals->ma_version_tag;
    cache[1].cache = (uintptr_t)value;
    PUSH(PyStackRef_FromPyObjectNew(value));
    next_instr += 2;
    DISPATCH();
}
Attribute access cache (simplified sketch)
// Python/ceval.c - LOAD_ATTR optimization
case LOAD_ATTR: {
    // cache layout:
    // [version_tag] [dict_offset] [index]
    _Py_CODEUNIT *cache = next_instr;
    _PyStackRef owner_ref = stack_pointer[-1];
    PyObject *owner = PyStackRef_AsPyObjectBorrow(owner_ref);
    PyTypeObject *tp = Py_TYPE(owner);
    // check the type version
    if (cache[0].cache == tp->tp_version_tag) {
        // cache hit: read straight from the dict
        PyDictObject *dict = *(PyDictObject **)
            ((char *)owner + cache[1].cache);
        if (dict != NULL) {
            PyDictKeysObject *keys = dict->ma_keys;
            Py_ssize_t index = cache[2].cache;
            if (index < keys->dk_nentries) {
                PyObject *value = keys->dk_entries[index].me_value;
                if (value != NULL) {
                    Py_INCREF(value);
                    stack_pointer[-1] = PyStackRef_FromPyObjectSteal(value);
                    PyStackRef_CLOSE(owner_ref);
                    next_instr += 3;
                    DISPATCH();
                }
            }
        }
    }
    // cache miss or stale cache: take the regular path
    PyObject *name = PyTuple_GET_ITEM(co->co_names, oparg);
    PyObject *res = PyObject_GetAttr(owner, name);
    PyStackRef_CLOSE(owner_ref);
    if (res == NULL) goto error;
    // refresh the cache (when possible)
    if (PyType_HasFeature(tp, Py_TPFLAGS_VALID_VERSION_TAG)) {
        cache[0].cache = tp->tp_version_tag;
        // set dict_offset and index...
    }
    stack_pointer[-1] = PyStackRef_FromPyObjectSteal(res);
    next_instr += 3;
    DISPATCH();
}
10.4 UML class diagrams
Interpreter core class diagram
classDiagram
class PyThreadState {
+PyInterpreterState* interp
+_PyInterpreterFrame* current_frame
+int py_recursion_remaining
+int recursion_headroom
+PyObject* current_exception
+eval_breaker
+_Py_Executor* current_executor
}
class _PyInterpreterFrame {
+_PyStackRef f_executable
+_PyInterpreterFrame* previous
+_PyStackRef f_funcobj
+PyObject* f_globals
+PyObject* f_builtins
+PyObject* f_locals
+PyFrameObject* frame_obj
+_Py_CODEUNIT* instr_ptr
+_PyStackRef* stackpointer
+uint16_t return_offset
+char owner
+_PyStackRef localsplus[1]
}
class PyCodeObject {
+PyObject* co_consts
+PyObject* co_names
+PyBytes* co_code
+int co_argcount
+int co_nlocalsplus
+int co_stacksize
+int co_firstlineno
+PyObject* co_linetable
+uint32_t co_version
+_PyExecutorArray* co_executors
}
class _PyStackRef {
+uintptr_t bits
+PyStackRef_AsPyObjectBorrow()
+PyStackRef_Dup()
+PyStackRef_CLOSE()
}
class _Py_CODEUNIT {
+uint8_t opcode
+uint8_t oparg
+uint32_t cache
}
class PyFrameObject {
+PyFrameObject* f_back
+_PyInterpreterFrame* f_frame
+PyObject* f_trace
+int f_lineno
+PyObject* f_extra_locals
}
class _Py_Executor {
+uintptr_t execute
+_PyExecutorLinkListNode links
+_PyBloomFilter vm_data
+uint16_t exit_count
}
PyThreadState "1" --> "1" _PyInterpreterFrame: current_frame
_PyInterpreterFrame "1" --> "0..1" _PyInterpreterFrame: previous
_PyInterpreterFrame "1" --> "1" PyCodeObject: f_executable
_PyInterpreterFrame "1" --> "0..1" PyFrameObject: frame_obj
_PyInterpreterFrame "1" --> "*" _PyStackRef: localsplus/stack
PyCodeObject "1" --> "*" _Py_CODEUNIT: co_code
PyCodeObject "1" --> "0..*" _Py_Executor: co_executors
PyThreadState "1" --> "0..1" _Py_Executor: current_executor
PyFrameObject "1" --> "1" _PyInterpreterFrame: f_frame
Value stack layout (UML)
classDiagram
class FrameLayout {
<<structure>>
+_PyInterpreterFrame header
+_PyStackRef[co_nlocalsplus] locals
+_PyStackRef[co_stacksize] stack
}
class LocalsSection {
<<section>>
+_PyStackRef[co_nlocals] args_and_locals
+_PyStackRef[co_cellvars] cell_vars
+_PyStackRef[co_freevars] free_vars
}
class StackSection {
<<section>>
+_PyStackRef[] evaluation_stack
+stackpointer: current_top
}
FrameLayout "1" *-- "1" LocalsSection: localsplus
FrameLayout "1" *-- "1" StackSection: stack
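The locals section of this layout can be observed indirectly from Python through the name tables on a code object. The following is a minimal sketch, not a view of the real C structure: `outer`, `inner`, and `describe_layout` are illustrative names, and `co_nlocalsplus` itself is not exposed to Python, so only the pieces that are exposed get reported.

```python
def outer(a, b):
    total = a + b          # captured by inner, so it lives in a cell
    def inner(c):
        return total + c   # 'total' is a free variable of inner
    return inner

def describe_layout(code):
    """Summarize the name tables that back a frame's locals section."""
    return {
        "locals": code.co_varnames,        # arguments first, then other fast locals
        "cells": code.co_cellvars,         # locals captured by nested functions
        "frees": code.co_freevars,         # variables captured from enclosing scopes
        "stack_slots": code.co_stacksize,  # maximum evaluation stack depth
    }

outer_layout = describe_layout(outer.__code__)
inner_layout = describe_layout(outer(1, 2).__code__)
print(outer_layout)
print(inner_layout)
```

The evaluation stack itself is not visible from Python; `co_stacksize` only gives the upper bound the compiler computed for it.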
Instruction Cache UML
classDiagram
class InlineCache {
<<abstract>>
+uint32_t counter
+validate() bool
+update()
}
class LoadGlobalCache {
+uint32_t version_tag
+PyObject* cached_value
+validate() bool
+update(globals, builtins, name)
}
class LoadAttrCache {
+uint32_t type_version
+Py_ssize_t dict_offset
+Py_ssize_t index
+validate() bool
+update(owner, name)
}
class BinaryOpCache {
+uint32_t left_type_version
+uint32_t right_type_version
+binaryfunc func_ptr
+validate() bool
+update(left, right)
}
class CallCache {
+uint32_t func_version
+int co_argcount
+PyCodeObject* code
+validate() bool
+update(callable)
}
InlineCache <|-- LoadGlobalCache
InlineCache <|-- LoadAttrCache
InlineCache <|-- BinaryOpCache
InlineCache <|-- CallCache
10.5 Detailed Sequence Diagrams
Complete Function Call Sequence
sequenceDiagram
autonumber
participant Caller as Caller frame
participant Eval as _PyEval_EvalFrameDefault
participant Stack as Value stack
participant Frame as _PyInterpreterFrame
participant Callee as Callee frame
participant Code as PyCodeObject
Caller->>Eval: Execute CALL instruction
Eval->>Stack: Pop arguments and callable
Stack-->>Eval: args[], callable
alt Python function fast path
Eval->>Code: Get func_code
Code-->>Eval: PyCodeObject
Eval->>Frame: _PyEvalFramePushAndInit()
Frame->>Frame: Allocate new frame
Frame->>Frame: Initialize localsplus
Frame->>Frame: Copy arguments
Frame-->>Eval: new_frame
Eval->>Eval: Save current state
Note over Eval: frame->instr_ptr = next_instr<br/>save stack_pointer
Eval->>Callee: Switch to new frame
Note over Eval,Callee: frame = new_frame<br/>tstate->current_frame = frame
Callee->>Code: Load bytecode
Code-->>Callee: _PyCode_CODE(co)
Callee->>Callee: Initialize next_instr
Callee->>Callee: Initialize stack_pointer
Callee->>Callee: Execute function body
Note over Callee: for (;;) { DISPATCH() }
Callee->>Callee: RETURN_VALUE instruction
Callee->>Stack: Get return value
Stack-->>Callee: retval
Callee->>Frame: _PyEvalFrameClearAndPop()
Frame->>Frame: Clear localsplus
Frame->>Frame: Release resources
Callee->>Caller: Restore caller frame
Note over Callee,Caller: frame = prev<br/>tstate->current_frame = frame
Caller->>Caller: Restore registers
Note over Caller: next_instr = frame->instr_ptr<br/>restore stack_pointer
Caller->>Stack: Push return value
Stack->>Caller: Continue execution
else C function slow path
Eval->>Eval: _PyObject_Vectorcall()
Note over Eval: Call tp_vectorcall
Eval->>Stack: Clean up arguments
Eval->>Stack: Push return value
Eval->>Caller: Continue execution
end
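The fast path in this sequence (push a frame, switch to it, and on return pop it and resume the caller) can be modeled with a toy stack machine. This is a sketch under stated assumptions, not CPython's implementation: the `Frame` class, the `run` function, and the four-instruction set are all hypothetical, chosen only to show that a Python-to-Python call switches frames inside one loop instead of recursing in the host.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Frame:
    code: list                                  # list of (opname, arg) pairs
    local_slots: list                           # arguments and locals
    stack: list = field(default_factory=list)   # evaluation stack
    pc: int = 0                                 # plays the role of instr_ptr
    previous: Optional["Frame"] = None          # link to the caller's frame

def run(functions, entry, args=()):
    """Execute `entry` in a single loop, switching frames on CALL/RETURN_VALUE."""
    frame = Frame(functions[entry], list(args))
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1  # a suspended caller resumes after its CALL
        if op == "LOAD_FAST":
            frame.stack.append(frame.local_slots[arg])
        elif op == "LOAD_CONST":
            frame.stack.append(arg)
        elif op == "ADD":
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif op == "CALL":
            name, nargs = arg
            split = len(frame.stack) - nargs
            call_args = frame.stack[split:]     # pop the arguments
            del frame.stack[split:]
            frame = Frame(functions[name], call_args, previous=frame)
        elif op == "RETURN_VALUE":
            retval = frame.stack.pop()
            if frame.previous is None:          # top-level frame: leave the loop
                return retval
            frame = frame.previous              # restore the caller
            frame.stack.append(retval)          # push the return value
        else:
            raise ValueError(f"unknown instruction: {op}")

FUNCTIONS = {
    "add": [("LOAD_FAST", 0), ("LOAD_FAST", 1), ("ADD", None),
            ("RETURN_VALUE", None)],
    "main": [("LOAD_CONST", 2), ("LOAD_CONST", 3), ("CALL", ("add", 2)),
             ("LOAD_CONST", 10), ("ADD", None), ("RETURN_VALUE", None)],
}

result = run(FUNCTIONS, "main")
print(result)
```

Because the saved `pc` and `stack` live on the frame object, switching frames is just a reassignment of `frame`, which mirrors the save and restore notes in the diagram.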
Complete Exception Handling Sequence
sequenceDiagram
autonumber
participant Code as User code
participant Eval as _PyEval_EvalFrameDefault
participant Frame as Current frame
participant ExcTable as Exception table
participant Handler as Exception handler
participant Unwinder as Stack unwinder
Code->>Eval: Instruction raises an exception
Eval->>Eval: Detect error flag
Eval->>Frame: Get current position
Frame-->>Eval: instr_ptr offset
Eval->>ExcTable: Look up exception handler
Note over ExcTable: co_exceptiontable<br/>binary search
alt Handler found
ExcTable-->>Eval: handler_offset
Eval->>Eval: Jump to handler
Note over Eval: next_instr = handler
Eval->>Frame: Unwind value stack to the handler's depth
Frame->>Frame: Pop and close references
Eval->>Frame: Push exception object
Note over Frame: PUSH(lasti) if the handler entry requests it<br/>PUSH(exc)
Handler->>Eval: Execute except block
Eval->>Code: Continue execution
else Handler not found
Eval->>Unwinder: exception_unwind
loop Walk the call stack
Unwinder->>Frame: Check previous frame
alt Is entry_frame
Unwinder->>Unwinder: Exit loop
else Regular frame
Unwinder->>ExcTable: Look up handler
alt Found
ExcTable-->>Unwinder: handler_offset
Unwinder->>Frame: Switch to that frame
Unwinder->>Eval: Jump to handler
Eval->>Code: Continue execution
else Not found
Unwinder->>Frame: _PyEvalFrameClearAndPop()
Unwinder->>Frame: Switch to previous
Unwinder->>Unwinder: Continue loop
end
end
end
Unwinder->>Eval: exit_unwind
Eval->>Code: Return NULL (exception propagates)
end
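The handler lookup and stack cleanup in this sequence can be sketched with a simplified table. The `Entry` tuple, `find_handler`, and `enter_handler` below are hypothetical; the real `co_exceptiontable` is a compact varint-encoded byte string. The sketch assumes entries are sorted by start offset and do not overlap.

```python
from bisect import bisect_right
from typing import NamedTuple

class Entry(NamedTuple):
    start: int    # first protected instruction offset (inclusive)
    end: int      # end of the protected range (exclusive)
    target: int   # offset of the handler
    depth: int    # evaluation stack depth to restore before entering the handler
    lasti: bool   # whether to push the offset of the raising instruction

def find_handler(table, offset):
    """Binary-search a table sorted by `start` for the entry covering `offset`."""
    starts = [entry.start for entry in table]
    index = bisect_right(starts, offset) - 1
    if index >= 0 and table[index].start <= offset < table[index].end:
        return table[index]
    return None

def enter_handler(stack, entry, offset, exc):
    """Trim the stack to the handler's depth, push the exception, return the target."""
    del stack[entry.depth:]      # pop everything above the recorded depth
    if entry.lasti:
        stack.append(offset)     # only when the entry asks for it
    stack.append(exc)
    return entry.target

table = [Entry(2, 6, 20, 0, False), Entry(6, 10, 30, 1, True)]
stack = ["a", "b", "c"]
exc = ValueError("boom")
handler = find_handler(table, 7)
next_pc = enter_handler(stack, handler, 7, exc)
print(next_pc, stack)
```

A lookup that returns `None` corresponds to the "handler not found" branch, where the frame is popped and the search repeats in the previous frame.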
Inline Cache Workflow
sequenceDiagram
autonumber
participant Instr as Instruction
participant Cache as Inline cache
participant Dict as Dictionary
participant Type as Type object
participant Result as Result
Instr->>Cache: Read cache entry
Cache-->>Instr: version, cached_data
Instr->>Type: Get current version
Type-->>Instr: current_version
alt Version matches
Instr->>Instr: Cache hit
Note over Instr: cache_hit++
Instr->>Cache: Read cached value
Cache-->>Result: cached_value
Result->>Instr: Use directly (fast path)
Note over Result: Skips the dictionary lookup<br/>saves ~50-70% of the time
else Version mismatch
Instr->>Instr: Cache miss
Note over Instr: cache_miss++
Instr->>Dict: Regular lookup
Dict-->>Result: value
Instr->>Type: Check whether cacheable
alt Cacheable
Instr->>Cache: Update cache
Note over Cache: cache[0] = new_version<br/>cache[1] = value<br/>cache[2] = metadata
Cache-->>Instr: Cache updated
else Not cacheable
Instr->>Instr: Disable caching
Note over Instr: Mark as uncacheable<br/>avoid further attempts
end
Result->>Instr: Use lookup result
end
Note over Instr,Result: Cache hit rate statistics<br/>feed adaptive optimization decisions
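The version check at the heart of this workflow can be illustrated with a small model. Both classes below are hypothetical stand-ins: `VersionedNamespace` plays the role of a dictionary or type with a version tag, and `CachedLoad` plays the role of one instruction with its own cache entry. The sketch only shows the guard-then-reuse pattern, not CPython's actual cache layout.

```python
class VersionedNamespace:
    """A mapping whose version counter is bumped on every write."""

    def __init__(self, **items):
        self._items = dict(items)
        self.version = 1

    def lookup(self, name):
        return self._items[name]       # the "regular lookup" slow path

    def store(self, name, value):
        self._items[name] = value
        self.version += 1              # invalidates every cache keyed on the old version


class CachedLoad:
    """One load site with its own cache entry, like a per-instruction inline cache."""

    def __init__(self, name):
        self.name = name
        self.cached_version = 0        # 0 never matches a live namespace version
        self.cached_value = None
        self.hits = 0
        self.misses = 0

    def execute(self, namespace):
        if self.cached_version == namespace.version:
            self.hits += 1             # version matches: reuse the cached value
            return self.cached_value
        self.misses += 1               # version mismatch: look up and refill
        value = namespace.lookup(self.name)
        self.cached_version = namespace.version
        self.cached_value = value
        return value


ns = VersionedNamespace(x=1)
site = CachedLoad("x")
first_results = [site.execute(ns) for _ in range(3)]
ns.store("x", 2)
after_store = site.execute(ns)
print(first_results, after_store, site.hits, site.misses)
```

The design choice worth noting is that a write never touches the cache directly. It only bumps the version, and each load site discovers the staleness on its next execution.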
10.6 Complete Function Call Chains
From Python Program Startup to Bytecode Execution
main() // Programs/python.c:17
└─> Py_BytesMain() // Modules/main.c:695
└─> pymain_main() // Modules/main.c:672
└─> Py_RunMain() // Modules/main.c:590
└─> pymain_run_python() // Modules/main.c:518
└─> pymain_run_file() // Modules/main.c:380
└─> _PyRun_AnyFileObject() // Python/pythonrun.c:88
├─> [interactive terminal]
│ └─> _PyRun_InteractiveLoopObject() // Python/pythonrun.c:267
│ └─> PyRun_InteractiveOneObjectEx() // Python/pythonrun.c:227
└─> [regular file]
└─> _PyRun_SimpleFileObject() // Python/pythonrun.c:387
└─> PyRun_FileExFlags() // Python/pythonrun.c:1127
└─> PyParser_ASTFromFileObject() // Parser/peg_api.c:87
└─> _PyPegen_run_parser_from_file() // Parser/pegen.c:598
└─> run_mod() // Python/pythonrun.c:1145
└─> PyEval_EvalCode() // Python/ceval.c:854
└─> _PyEval_Vector() // Python/ceval.c:1948
└─> _PyEval_EvalFrameDefault() // Python/ceval.c:1023
└─> [main interpreter loop]
Complete Call Chain of the CALL Instruction
_PyEval_EvalFrameDefault() // Python/ceval.c:1023
└─> case CALL: // Python/bytecodes.c:3521
├─> [check callable type]
│
├─> [fast path: Python function]
│ └─> _PyEvalFramePushAndInit() // Python/ceval.c:1823
│ ├─> _PyThreadState_BumpFrameDepth() // Python/ceval.c:1731
│ ├─> _PyFrame_PushUnchecked() // Python/ceval.c:1774
│ │ └─> _PyFrame_Initialize() // Python/ceval.c:1647
│ │ ├─> _PyFrame_SetStackPointer() // Include/internal/pycore_frame.h:62
│ │ └─> _PyFrame_GetLocalsArray() // Include/internal/pycore_frame.h:45
│ └─> _PyFunction_CopyParameters() // Python/ceval.c:1544
│ ├─> Copy positional arguments
│ ├─> Apply default values
│ └─> Handle keyword arguments
│
├─> [slow path: C function or other callable object]
│ └─> _PyObject_Vectorcall() // Objects/call.c:263
│ ├─> [check tp_vectorcall]
│ ├─> PyVectorcall_NARGS() // Include/cpython/abstract.h:121
│ └─> (*tp_vectorcall)(callable, args, nargsf, kwnames)
│ ├─> [built-in function]
│ │ └─> cfunction_vectorcall_*() // Objects/methodobject.c:465
│ │ └─> PyCFunction_Call() // Objects/methodobject.c:260
│ │
│ ├─> [instance method]
│ │ └─> method_vectorcall() // Objects/classobject.c:53
│ │ └─> _PyObject_Vectorcall() // [recursive]
│ │
│ └─> [class instance (__call__)]
│ └─> PyObject_Call() // Objects/call.c:290
│ └─> _PyObject_MakeTpCall() // Objects/call.c:215
│ └─> type->tp_call()
│
└─> [return value handling]
├─> [clean up arguments]
│ └─> PyStackRef_CLOSE() // Include/internal/pycore_stackref.h:89
│ └─> Py_DECREF() // Include/object.h:604
│
└─> [push return value onto the stack]
└─> PUSH(result)
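The first step of this chain, checking the callable's type, decides which branch runs. A rough Python-level approximation of that classification is sketched below. `classify_callable` is a hypothetical helper; the interpreter performs its checks in C and they do not map one-to-one onto these `isinstance` tests.

```python
import types

def classify_callable(obj):
    """Return a label for the branch of the call chain that `obj` would roughly take."""
    if isinstance(obj, types.FunctionType):
        return "python function"       # candidate for the frame-pushing fast path
    if isinstance(obj, types.MethodType):
        return "bound method"          # unwraps to the underlying function plus self
    if isinstance(obj, (types.BuiltinFunctionType, types.MethodDescriptorType)):
        return "builtin"               # implemented in C
    if isinstance(obj, type):
        return "class"                 # calling it constructs an instance
    if callable(obj):
        return "object with __call__"  # goes through the type's call slot
    return "not callable"

class Greeter:
    def greet(self):
        return "hi"

    def __call__(self):
        return "called"

labels = {
    "lambda": classify_callable(lambda: 0),
    "len": classify_callable(len),
    "method": classify_callable(Greeter().greet),
    "class": classify_callable(Greeter),
    "instance": classify_callable(Greeter()),
    "int value": classify_callable(42),
}
print(labels)
```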
Complete Call Chain of Attribute Access
_PyEval_EvalFrameDefault() // Python/ceval.c:1023
└─> case LOAD_ATTR: // Python/bytecodes.c:1754
├─> [check inline cache]
│ └─> cache hit → return directly
│
└─> [cache miss]
└─> PyObject_GetAttr() // Objects/object.c:1019
└─> _PyObject_GetAttr() // Objects/object.c:985
├─> type->tp_getattro() // type-specific getter
│ └─> PyObject_GenericGetAttr() // Objects/object.c:1370
│ ├─> _PyType_Lookup() // Objects/typeobject.c:4117
│ │ └─> [MRO traversal]
│ │ └─> PyDict_GetItemWithError() // Objects/dictobject.c:1824
│ │
│ ├─> [check descriptor]
│ │ ├─> descr_get = Py_TYPE(descr)->tp_descr_get
│ │ └─> descr_get(descr, obj, type) // calls __get__
│ │
│ ├─> [check instance __dict__]
│ │ └─> _PyObjectDict_GetItemWithError()
│ │
│ └─> [return class attribute]
│ └─> Py_NewRef(descr)
│
└─> [special methods]
└─> _PyType_LookupId() // Objects/typeobject.c:4161
└─> _PyType_Lookup() // [same as above]
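The lookup order inside the generic attribute getter (data descriptor on the type, then the instance `__dict__`, then a non-data descriptor or plain class attribute) can be checked from Python. The class names below are illustrative only.

```python
class DataDescriptor:
    """Defines __get__ and __set__, so it takes priority over the instance dict."""

    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    """Defines only __get__, so the instance dict can shadow it."""

    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Demo:
    first = DataDescriptor()
    second = NonDataDescriptor()
    third = "class attribute"


obj = Demo()
# Write straight into the instance dict to bypass DataDescriptor.__set__.
obj.__dict__["first"] = "instance value"
obj.__dict__["second"] = "instance value"

results = (obj.first, obj.second, obj.third)
print(results)
```

Only `second` is served from the instance dictionary: `first` is intercepted by the data descriptor before the dictionary is consulted, and `third` falls through to the class because the instance has no entry for it.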
Exception Propagation Call Chain
[instruction execution error]
└─> goto error; // Python/ceval.c
└─> error: // Python/ceval.c
└─> exception_unwind: // Python/ceval.c
│
├─> _PyErr_Occurred() // Python/errors.c:87
│ └─> tstate->current_exception != NULL
│
└─> [frame unwinding loop]
├─> _PyFrame_GetExceptHandler() // Python/ceval.c:1234
│ └─> get_exception_handler() // Python/ceval.c:1187
│ └─> _PyCode_GetCode() // Objects/codeobject.c:1745
│ └─> co->co_exceptiontable // binary search of the exception table
│
├─> [handler found]
│ └─> jump to handler
│ └─> PUSH(exc) // push the exception object
│ └─> continue execution
│
└─> [handler not found]
└─> _PyEvalFrameClearAndPop() // Python/ceval.c:1266
├─> _PyFrame_ClearLocals() // Python/ceval.c:1288
│ └─> PyStackRef_CLOSE() // clear local variables
│
├─> frame = frame->previous // return to the previous frame
│
└─> [continue unwinding or exit]
└─> exit_unwind:
└─> return NULL // exception propagates to the C level
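The effect of this unwinding loop is visible from Python in the traceback, which records one entry per frame the exception passed through. A minimal sketch, with illustrative function names:

```python
def level_three():
    raise ValueError("boom")

def level_two():
    level_three()           # no handler here: this frame is unwound

def level_one():
    try:
        level_two()
    except ValueError as exc:
        return exc          # handler found in this frame

def unwound_frames(exc):
    """List the function names recorded in the traceback, outermost first."""
    names = []
    tb = exc.__traceback__
    while tb is not None:
        names.append(tb.tb_frame.f_code.co_name)
        tb = tb.tb_next
    return names

caught = level_one()
frames = unwound_frames(caught)
print(frames)
```

The list starts at the frame that caught the exception and ends at the frame that raised it, which matches the direction of the `frame = frame->previous` walk read in reverse.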
10.7 Interpreter Architecture Diagrams
Overall Execution Architecture
flowchart TB
Start([Program start]) --> Init[Py_Initialize]
Init --> Main[Py_Main]
Main --> Parse[Parse source]
Parse --> Tokenize[Lexical analysis]
Tokenize --> PEG[PEG parsing]
PEG --> AST[AST generation]
AST --> Compile[Compile]
Compile --> SymTable[Symbol table]
SymTable --> CodeGen[Code generation]
CodeGen --> CFG[CFG optimization]
CFG --> Assemble[Bytecode assembly]
Assemble --> CodeObj[PyCodeObject]
CodeObj --> Eval[PyEval_EvalCode]
Eval --> CreateFrame[Create frame]
CreateFrame --> EvalDefault[_PyEval_EvalFrameDefault]
EvalDefault --> CheckTier2{Check Tier 2 optimization}
CheckTier2 -->|Has JIT code| JIT[_PyJIT_ExecuteTrace]
CheckTier2 -->|No JIT| Interpreter[Bytecode interpreter]
JIT --> JITResult{Execution result}
JITResult -->|Success| Return[Return result]
JITResult -->|Failure/exit| Interpreter
Interpreter --> FetchInstr[Fetch instruction]
FetchInstr --> CheckBreaker{eval_breaker?}
CheckBreaker -->|Yes| HandlePending[Handle pending events]
HandlePending --> Dispatch
CheckBreaker -->|No| Dispatch[Instruction dispatch]
Dispatch --> LoadInstr{Instruction type}
LoadInstr -->|LOAD_*| LoadOps[Load operations]
LoadInstr -->|STORE_*| StoreOps[Store operations]
LoadInstr -->|BINARY_OP| BinaryOps[Binary operations]
LoadInstr -->|CALL| CallOps[Function call]
LoadInstr -->|RETURN| ReturnOps[Return operations]
LoadInstr -->|JUMP| JumpOps[Jump operations]
LoadInstr -->|Other| OtherOps[Other operations]
LoadOps --> CheckCache1{Cached?}
CheckCache1 -->|Yes| CacheHit[Cache hit]
CheckCache1 -->|No| SlowPath[Slow path]
CacheHit --> NextInstr
SlowPath --> UpdateCache[Update cache]
UpdateCache --> NextInstr
StoreOps --> NextInstr[next_instr++]
BinaryOps --> NextInstr
JumpOps --> NextInstr
OtherOps --> NextInstr
CallOps --> CallType{Call type}
CallType -->|Python function| FastCall[Fast call]
CallType -->|C function| SlowCall[Generic call]
FastCall --> NewFrame[Create new frame]
NewFrame --> SwitchFrame[Switch frame]
SwitchFrame --> FetchInstr
SlowCall --> Vectorcall[_PyObject_Vectorcall]
Vectorcall --> NextInstr
ReturnOps --> PopFrame{Previous frame exists?}
PopFrame -->|Yes| RestoreFrame[Restore frame]
PopFrame -->|No| Return
RestoreFrame --> PushResult[Push return value]
PushResult --> NextInstr
NextInstr --> ErrorCheck{Error?}
ErrorCheck -->|Yes| ErrorHandler[Exception handling]
ErrorCheck -->|No| FetchInstr
ErrorHandler --> FindHandler{Handler found?}
FindHandler -->|Yes| HandleExc[Execute except block]
FindHandler -->|No| UnwindStack[Unwind stack]
HandleExc --> NextInstr
UnwindStack --> PopFrame
Return --> Cleanup[Clean up resources]
Cleanup --> End([Program exit])
style Start fill:#90EE90
style End fill:#FFB6C1
style EvalDefault fill:#87CEEB
style JIT fill:#FFD700
style Interpreter fill:#87CEEB
style ErrorHandler fill:#FF6347
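The front half of this pipeline (tokenize, parse to an AST, compile to a code object, evaluate) can be driven step by step from Python's standard library. This is a sketch of the stages only: the `tokenize` and `ast` modules are Python-level interfaces, and the interpreter's own startup path runs the equivalent stages in C.

```python
import ast
import io
import tokenize
import types

source = "x = 6 * 7\n"

# Lexical analysis: keep only tokens that carry visible text.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()
]

# Parsing: source text to AST.
tree = ast.parse(source, filename="<demo>", mode="exec")

# Compilation: AST to code object (symbol table, code generation, assembly).
code = compile(tree, filename="<demo>", mode="exec")

# Evaluation: run the code object against an explicit globals dictionary.
namespace = {}
exec(code, namespace)

print(tokens)
print(type(tree.body[0]).__name__, type(code).__name__, namespace["x"])
```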
Frame Management Architecture
flowchart TB
subgraph ThreadState["PyThreadState"]
CurrentFrame[current_frame]
PyRecursion[py_recursion_remaining]
Executor[current_executor]
end
subgraph FrameStack["Frame linked list"]
direction TB
Frame1["Frame N (current)"]
Frame2["Frame N-1"]
Frame3["Frame N-2"]
EntryFrame["Entry Frame"]
Frame1 -->|previous| Frame2
Frame2 -->|previous| Frame3
Frame3 -->|previous| EntryFrame
end
subgraph SingleFrame["_PyInterpreterFrame"]
direction TB
Executable[f_executable: code object]
FuncObj[f_funcobj: function object]
Globals[f_globals: global variables]
Builtins[f_builtins: built-in names]
InstrPtr[instr_ptr: instruction pointer]
StackPtr[stackpointer: stack pointer]
subgraph Localsplus["localsplus array"]
direction LR
Args["Arguments and locals<br/>[0..co_nlocals)"]
CellVars["Cell variables<br/>(co_cellvars)"]
FreeVars["Free variables<br/>(co_freevars)"]
EvalStack["Value stack<br/>[co_nlocalsplus..]"]
end
end
subgraph CodeObject["PyCodeObject"]
Code[co_code: bytecode]
Consts[co_consts: constants]
Names[co_names: names]
ExcTable[co_exceptiontable: exception table]
Executors[co_executors: JIT executors]
end
CurrentFrame -. points to .-> Frame1
Frame1 -. references .-> Executable
Frame1 -. references .-> FuncObj
Executable -. is actually .-> CodeObject
InstrPtr -. points to .-> Code
StackPtr -. points to .-> EvalStack
style ThreadState fill:#FFE4B5
style FrameStack fill:#E0FFFF
style SingleFrame fill:#F0E68C
style CodeObject fill:#DDA0DD
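The frame linked list in this diagram has a Python-visible counterpart: each frame object exposes `f_back`, which plays the role of `previous`. A minimal sketch follows, using the CPython-specific `sys._getframe`; the function names are illustrative.

```python
import sys

def call_stack_names(frame, stop_name):
    """Follow f_back links from `frame` until the function named `stop_name`."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        if frame.f_code.co_name == stop_name:
            break
        frame = frame.f_back    # step to the caller's frame
    return names

def innermost():
    return call_stack_names(sys._getframe(), "outermost")

def middle():
    return innermost()

def outermost():
    return middle()

chain = outermost()
print(chain)
```

The walk stops at `outermost` on purpose, since the frames above it depend on how the script itself was launched.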
Instruction Execution Flow Architecture
flowchart LR
subgraph Input["Input"]
InstrWord["_Py_CODEUNIT<br/>instruction word"]
end
subgraph Decode["Decode"]
Opcode["opcode<br/>(8 bits)"]
Oparg["oparg<br/>(8 bits)"]
end
subgraph Dispatch["Dispatch mechanism"]
direction TB
ComputedGoto["Computed Goto<br/>(GCC/Clang)"]
SwitchCase["Switch-Case<br/>(other compilers)"]
end
subgraph Execute["Execute"]
direction TB
subgraph LoadGroup["Load instructions"]
LoadFast["LOAD_FAST<br/>local variable"]
LoadConst["LOAD_CONST<br/>constant"]
LoadGlobal["LOAD_GLOBAL<br/>global variable<br/>(cached)"]
LoadAttr["LOAD_ATTR<br/>attribute<br/>(cached)"]
end
subgraph StoreGroup["Store instructions"]
StoreFast["STORE_FAST<br/>local variable"]
StoreGlobal["STORE_GLOBAL<br/>global variable"]
StoreAttr["STORE_ATTR<br/>attribute"]
end
subgraph OpGroup["Arithmetic instructions"]
BinaryOp["BINARY_OP<br/>binary operation"]
UnaryOp["UNARY_*<br/>unary operation"]
CompareOp["COMPARE_OP<br/>comparison"]
end
subgraph ControlGroup["Control flow"]
Jump["JUMP_*<br/>jump"]
Call["CALL<br/>function call"]
Return["RETURN_VALUE<br/>return"]
end
end
subgraph Stack["Value stack operations"]
Push["PUSH<br/>push"]
Pop["POP<br/>pop"]
Peek["PEEK<br/>peek"]
end
subgraph Cache["Inline cache"]
Version["Version check"]
CacheHit["Cache hit"]
CacheMiss["Cache miss"]
Update["Update cache"]
end
InstrWord --> Opcode
InstrWord --> Oparg
Opcode --> ComputedGoto
Opcode --> SwitchCase
ComputedGoto --> Execute
SwitchCase --> Execute
LoadFast --> Stack
LoadConst --> Stack
LoadGlobal --> Cache
LoadAttr --> Cache
Cache --> Version
Version -->|match| CacheHit
Version -->|mismatch| CacheMiss
CacheHit --> Stack
CacheMiss --> Update
Update --> Stack
StoreFast --> Stack
StoreGlobal --> Stack
BinaryOp --> Stack
UnaryOp --> Stack
Call --> NewFrame["Create new frame"]
Return --> RestoreFrame["Restore frame"]
Stack --> NextInstr["next_instr++"]
NewFrame --> NextInstr
RestoreFrame --> NextInstr
style Input fill:#90EE90
style Decode fill:#FFD700
style Dispatch fill:#87CEEB
style Execute fill:#DDA0DD
style Stack fill:#F0E68C
style Cache fill:#FFB6C1
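The decode step at the start of this flow, splitting each 16-bit code unit into an 8-bit opcode and an 8-bit oparg, can be reproduced on a real code object. The sketch below reads `co_code` two bytes at a time and names each opcode through `dis.opname`. The `decode` helper is illustrative, it does not fold `EXTENDED_ARG` prefixes into the following oparg, and the exact instruction names printed depend on the CPython version.

```python
import dis

def add(a, b):
    return a + b

def decode(code):
    """Split co_code into (offset, opcode name, oparg) triples, one per code unit."""
    raw = code.co_code
    units = []
    for offset in range(0, len(raw), 2):
        opcode, oparg = raw[offset], raw[offset + 1]   # one byte each
        units.append((offset, dis.opname[opcode], oparg))
    return units

units = decode(add.__code__)
for offset, name, oparg in units:
    print(f"{offset:4d} {name:<24} {oparg}")
```

On versions with inline caches, the cache slots show up in this listing as `CACHE` units following the instruction that owns them.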