Python's subprocess module: basic usage and pitfalls

Date: June 8, 2015

subprocess is a module introduced in Python 2.4, mainly intended to replace the following modules and functions:

os.system
os.spawn*
os.popen*
popen2.*
commands.*

See PEP 324: http://legacy.python.org/dev/peps/pep-0324/

It is a module for running external commands. It replaces several older modules and offers a friendlier, more uniform interface.

Three convenience functions

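When using the three functions below, keep two things in mind: 1. shell=True and shell=False parse the command in different ways. 2. Be careful with PIPE, which can cause the call to hang.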

subprocess.call: runs the command, waits for it to finish, and returns the returncode.

subprocess.check_call: runs the command and waits for it to finish. If the return code is 0, it returns that returncode; otherwise it raises a CalledProcessError that carries the returncode.

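subprocess.check_output: like check_call, it checks whether the return code is 0 (raising CalledProcessError if it is not), and it returns the command's stdout.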
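The difference between the three functions can be illustrated with a minimal sketch. It assumes sys.executable as a portable stand-in for an external command; the names ok_cmd and fail_cmd are hypothetical and exist only for this example.

```python
import subprocess
import sys

# sys.executable stands in for an arbitrary external command (an assumption
# made so the example runs anywhere Python is installed).
ok_cmd = [sys.executable, "-c", "print('hello')"]
fail_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

# call: waits for completion and returns the exit status, whatever it is.
rc_ok = subprocess.call(ok_cmd)
rc_fail = subprocess.call(fail_cmd)

# check_call: returns 0 on success, raises CalledProcessError otherwise.
try:
    subprocess.check_call(fail_cmd)
    check_call_rc = 0
except subprocess.CalledProcessError as exc:
    check_call_rc = exc.returncode

# check_output: returns the child's stdout, and also raises
# CalledProcessError when the exit status is non-zero.
output = subprocess.check_output(ok_cmd, universal_newlines=True)
```

Here call simply reports the exit status 3, while check_call turns the same failure into an exception.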

Common causes of hangs

When using this module, the child process may block or hang. It usually happens with code like this:

import subprocess
import shlex

# cmd is the command line to run, given as a single string
proc = subprocess.Popen(shlex.split(cmd), stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        shell=False, universal_newlines=True)
print(proc.stdout.read())

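This code reads the Popen object's stdout directly while using subprocess.PIPE. The hang occurs because a pipe's buffer fills up: the child cannot write any more data, and nothing is reading the buffered data out. For instance, stderr is also a PIPE here, but it is never read while read() is blocked waiting on stdout.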

The official documentation gives this hint (the Warning on the Popen.wait() method):

This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

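To solve this problem, Popen provides a communicate() method. It reads the stdout and stderr data into memory and keeps reading until both streams reach end-of-file (EOF). communicate() also accepts an input argument, the data to write to stdin, which makes it possible to pass data between processes.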
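A minimal sketch of communicate() follows. The child program is a hypothetical one written for this example: it upper-cases whatever it receives on stdin and writes a short note to stderr. sys.executable is assumed as the command so the sketch is self-contained.

```python
import subprocess
import sys

# Hypothetical child: reads all of stdin, echoes it upper-cased to stdout,
# and writes a short note to stderr.
child_code = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "sys.stdout.write(data.upper())\n"
    "sys.stderr.write('done')\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

# communicate() sends the input, closes stdin, drains both pipes until EOF,
# and waits for the child to exit, so neither pipe buffer can fill up.
out, err = proc.communicate(input="hello\n")
```

After the call returns, proc.returncode is set, and out and err hold the complete output as strings.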

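Note: the official documentation warns that the data read is buffered in memory, so do not use communicate() when the amount of data is very large or unbounded, since that can lead to an OOM (out of memory) condition.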

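In general, when stdout and stderr do not carry much data, communicate() causes no problems. When the volume is very large, consider using a file in place of PIPE, for example stdout=open("stdout", "w") [Reference 1].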
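The file-based approach might look like the sketch below. The noisy child and the temporary file path are assumptions chosen for illustration; any writable path would work the same way.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical noisy child: prints 100000 lines, far more than a typical
# pipe buffer can hold.
child_code = "for i in range(100000): print(i)"

# Temporary output path used only for this example.
fd, out_path = tempfile.mkstemp(suffix=".log")
os.close(fd)

with open(out_path, "w") as out_file:
    # The child writes straight to the file, so there is no pipe buffer
    # to fill and nothing is held in the parent's memory.
    rc = subprocess.call([sys.executable, "-c", child_code], stdout=out_file)

# Process the output afterwards, line by line, without loading it all at once.
with open(out_path) as result:
    line_count = sum(1 for _ in result)

os.remove(out_path)
```

The trade-off is that the output only becomes convenient to read once the child has finished.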

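Reference 2 gives another approach: use select to read the data from Popen's stdout and stderr. This use of select is not supported on Windows, but it does allow the data to be read in close to real time.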
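The select idea can be sketched roughly as follows. This is not the code from Reference 2, only an illustration under the assumption of a Unix-like system; the child program and the variable names are hypothetical.

```python
import os
import select
import subprocess
import sys

# Hypothetical child that writes to both stdout and stderr.
child_code = (
    "import sys\n"
    "for i in range(3):\n"
    "    sys.stdout.write('out %d\\n' % i)\n"
    "    sys.stderr.write('err %d\\n' % i)\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)

chunks = {proc.stdout: [], proc.stderr: []}
open_streams = [proc.stdout, proc.stderr]

while open_streams:
    # Block until at least one pipe has data or has reached EOF.
    readable, _, _ = select.select(open_streams, [], [])
    for stream in readable:
        data = os.read(stream.fileno(), 4096)
        if data:
            chunks[stream].append(data)
        else:
            # An empty read means EOF: stop watching this pipe.
            open_streams.remove(stream)

proc.wait()
stdout_data = b"".join(chunks[proc.stdout])
stderr_data = b"".join(chunks[proc.stderr])
proc.stdout.close()
proc.stderr.close()
```

Because both pipes are drained as soon as data arrives, neither buffer can fill up, and each chunk could be handled immediately instead of being collected in a list.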

What the shell parameter of Popen means

The official documentation recommends shell=False because it is safer. Let's look at the example the documentation gives.

>>> from subprocess import call
>>> filename = input("What file would you like to display?\n")
What file would you like to display?
non_existent; rm -rf / #
>>> call("cat " + filename, shell=True) # Uh-oh. This will end badly...
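For contrast, here is a sketch of the same input handled with an argument list and shell=False. To keep it harmless and portable, a small Python child (an assumption standing in for cat) only reports the arguments it received.

```python
import subprocess
import sys

# The same malicious input as in the example above.
filename = "non_existent; rm -rf / #"

# Stand-in for "cat": a child that prints how many arguments it got,
# followed by the first one.
show_args = "import sys; print(len(sys.argv) - 1); print(sys.argv[1])"

# With an argument list and shell=False, no shell ever parses the string,
# so the whole input arrives as one literal argument.
received = subprocess.check_output(
    [sys.executable, "-c", show_args, filename],
    shell=False,
    universal_newlines=True,
)
arg_count, arg_value = received.splitlines()
```

The semicolon and everything after it stay part of a single file name, so a real cat would merely report that the file does not exist.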