Program stuck? How to debug a process that is already running

source link: http://mp.weixin.qq.com/s?__biz=MzA5MTkxNTMzNg%3D%3D&%3Bmid=2650266422&%3Bidx=5&%3Bsn=306c05c6a14c51b05d1b927ffb4b3ed3

Source: https://mozillazg.com/2017/07/debug-running-python-process-with-gdb.html

Suppose a program like the following test.py is running on a server. How can we tell whether it is running normally, and which step it has reached?

# -*- coding: utf-8 -*-
import time


def do(x):
    time.sleep(10)


def main():
    for x in range(10000):
        do(x)


if __name__ == '__main__':
    main()

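This program writes no logs and prints nothing, so log files, standard output, and standard error cannot tell us what state it is in. One workable approach is to use gdb to inspect what the program is doing right now.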

0. Test environment

  • System: Ubuntu 16.04.1 LTS

  • Python: 2.7.12

1. Preparation

Install gdb and python2.7-dbg:

$ sudo apt-get install gdb python2.7-dbg

Set /proc/sys/kernel/yama/ptrace_scope to 0 so that gdb is allowed to attach to a running process owned by the same user:

$ echo 0 |sudo tee /proc/sys/kernel/yama/ptrace_scope
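Before changing the setting, it can help to see what it currently is. The following is a minimal sketch, not part of the original article: the helper names read_ptrace_scope and describe_ptrace_scope are made up for illustration, and the descriptions paraphrase the Yama values documented for the Linux kernel.

```python
# Meanings of the Yama ptrace_scope values (paraphrased from the kernel docs).
PTRACE_SCOPE_MEANINGS = {
    0: "classic ptrace permissions: same-uid processes can be attached to",
    1: "restricted ptrace: only descendants (or declared debuggers) can be attached to",
    2: "admin-only attach: attaching requires CAP_SYS_PTRACE",
    3: "no attach: attaching is disabled and the value cannot be changed once set",
}


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the current ptrace_scope value, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


def describe_ptrace_scope(value):
    """Map a ptrace_scope value to a short human-readable description."""
    return PTRACE_SCOPE_MEANINGS.get(value, "unknown or Yama not enabled")


if __name__ == "__main__":
    value = read_ptrace_scope()
    print("ptrace_scope = {!r}: {}".format(value, describe_ptrace_scope(value)))
```

On Ubuntu 16.04 the default is typically 1, which is why attaching to an unrelated process fails until the value is set to 0 (or gdb is run as root).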

Run test.py:

$ python test.py &
[1] 6489
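Here the shell prints the PID (6489) because the job was started in the background. If the process was started somewhere else, the PID has to be looked up first. The sketch below is one possible way to do that on Linux by scanning /proc; find_pids is a hypothetical helper, not something the original article uses.

```python
import os


def find_pids(script_name, proc_root="/proc"):
    """Return sorted PIDs whose command line has an argument named script_name."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except (IOError, OSError):
            continue  # the process exited or is not readable
        # Arguments in /proc/PID/cmdline are separated by NUL bytes.
        args = [a for a in raw.split(b"\0") if a]
        # Skip args[0] (the interpreter) and compare the remaining base names.
        if any(os.path.basename(a) == script_name.encode() for a in args[1:]):
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    print(find_pids("test.py"))
```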

Attach to the running process with gdb python PID:

$ gdb python 6489
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/90/d1300febaeb0a626baa2540d19df2416cd3361.debug...done.
done.
...
Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/ld-2.23.so...done.
done.
0xb778fc31 in __kernel_vsyscall ()
(gdb)

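2. Generating a core file

The process stays paused for as long as gdb is attached. To keep the impact on it small, save its current state to a core file and then detach: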

(gdb) generate-core-file
warning: target file /proc/6489/cmdline contained unexpected null characters
Saved corefile core.6489
(gdb) quit
A debugging session is active.

    Inferior 1 [process 6489] will be detached.

Quit anyway? (y or n) y

The core file can then be loaded with gdb python core.PID:

$ gdb python core.6489
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
...
Type "apropos word" to search for commands related to "word"...
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/90/d1300febaeb0a626baa2540d19df2416cd3361.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 6489]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `python'.
#0  0xb778fc31 in __kernel_vsyscall ()
(gdb)
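Both steps (dumping the core and reading it back) can also be scripted. The sketch below only builds the argument lists; it assumes gdb's -batch and -ex options and the generate-core-file FILE form of the command. The function names core_dump_argv and inspect_core_argv are hypothetical.

```python
def core_dump_argv(pid, core_path):
    """argv that attaches to pid, writes a core file, then detaches and exits."""
    return ["gdb", "-batch",
            "-ex", "generate-core-file {}".format(core_path),
            "python", str(pid)]


def inspect_core_argv(core_path, commands=("py-bt",)):
    """argv that loads a core file and runs the given gdb commands in order."""
    argv = ["gdb", "-batch"]
    for cmd in commands:
        argv += ["-ex", cmd]
    return argv + ["python", core_path]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing gets attached to.
    print(" ".join(core_dump_argv(6489, "core.6489")))
    print(" ".join(inspect_core_argv("core.6489", ["py-bt", "py-locals"])))
```

The lists can be passed to subprocess.check_output when the commands should actually run; that needs the same privileges as the interactive session above.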

3. Available Python-related commands

Type py and press Tab to list the available commands:

(gdb) py
py-bt               py-down             py-locals           py-up               python-interactive
py-bt-full          py-list             py-print            python

Use help CMD to see what each command does:

(gdb) help py-bt
Display the current python frame and all the frames within its call stack (if any)

Source code at the current execution position

(gdb) py-list
   1    # -*- coding: utf-8 -*-
   2    import time
   3
   4
   5    def do(x):
  >6        time.sleep(10)
   7
   8
   9    def main():
  10        for x in range(10000):
  11            do(x)
(gdb)

We can see that the program is currently executing time.sleep(10).

Call stack at the current position

(gdb) py-bt
Traceback (most recent call first):
  <built-in function sleep>
  File "test.py", line 6, in do
    time.sleep(10)
  File "test.py", line 11, in main
    do(x)
  File "test.py", line 15, in <module>
    main()
(gdb)

The call chain is main() -> do(x) -> time.sleep(10).
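When py-bt output is collected by a script, the call chain can be extracted automatically. The following is a minimal sketch that assumes the output format shown above (File "...", line N, in NAME); other versions of the gdb helper may print frames differently. parse_py_bt and call_chain are hypothetical names.

```python
import re

# Matches frame lines such as:   File "test.py", line 6, in do
FRAME_RE = re.compile(
    r'^\s*File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def parse_py_bt(text):
    """Return (file, line, function) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            frames.append((m.group("file"), int(m.group("line")), m.group("func")))
    return frames


def call_chain(frames):
    """Join function names with the outermost caller first."""
    return " -> ".join(func for _, _, func in reversed(frames))


if __name__ == "__main__":
    sample = '''Traceback (most recent call first):
  <built-in function sleep>
  File "test.py", line 6, in do
    time.sleep(10)
  File "test.py", line 11, in main
    do(x)
  File "test.py", line 15, in <module>
    main()
'''
    print(call_chain(parse_py_bt(sample)))
```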

Inspecting variable values

(gdb) py-list
   1    # -*- coding: utf-8 -*-
   2    import time
   3
   4
   5    def do(x):
  >6        time.sleep(10)
   7
   8
   9    def main():
  10        for x in range(10000):
  11            do(x)
(gdb) py-print x
local 'x' = 12
(gdb)
(gdb) py-locals
x = 12
(gdb)

Inspecting the caller's frame

(gdb) py-up
#9 Frame 0xb74c0994, for file test.py, line 11, in main (x=12)
    do(x)
(gdb) py-list
   6        time.sleep(10)
   7
   8
   9    def main():
  10        for x in range(10000):
 >11            do(x)
  12
  13
  14    if __name__ == '__main__':
  15        main()
(gdb) py-print x
local 'x' = 12
(gdb)

Use py-down to go back:

(gdb) py-down
#6 Frame 0xb74926e4, for file test.py, line 6, in do (x=12)
    time.sleep(10)
(gdb) py-list
   1    # -*- coding: utf-8 -*-
   2    import time
   3
   4
   5    def do(x):
  >6        time.sleep(10)
   7
   8
   9    def main():
  10        for x in range(10000):
  11            do(x)
(gdb)

4. Debugging a multithreaded program

Test program test2.py:

# -*- coding: utf-8 -*-
from threading import Thread
import time


def do(x):
    x = x * 3
    time.sleep(x * 60)


def main():
    threads = []
    for x in range(1, 3):
        t = Thread(target=do, args=(x,))
        t.start()
        threads.append(t)  # keep a reference so the join loop below has threads to wait on
    for t in threads:
        t.join()


if __name__ == '__main__':
    main()

$ python test2.py &
[2] 12281

Listing all threads

Use the info threads command (run here against a core file saved in the same way as before):
$ gdb python core.12281

(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in __kernel_vsyscall ()
  2    Thread 0xb73b8b40 (LWP 11040) 0xb7711c31 in __kernel_vsyscall ()
  3    Thread 0xb69ffb40 (LWP 11041) 0xb7711c31 in __kernel_vsyscall ()
(gdb)

This program currently has 3 threads, and thread 1 is the selected one.
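The same information can be read programmatically from the info threads table. This sketch assumes the row layout shown above (an optional * marker, the gdb thread Id, then Thread ADDRESS (LWP N)); parse_info_threads is a hypothetical helper.

```python
import re

# Matches rows such as: * 1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in ...
THREAD_RE = re.compile(
    r'^(?P<current>\*)?\s*(?P<id>\d+)\s+'
    r'Thread (?P<addr>0x[0-9a-f]+) \(LWP (?P<lwp>\d+)\)')


def parse_info_threads(text):
    """Return one dict per thread row; the header line is skipped."""
    threads = []
    for line in text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            threads.append({
                "id": int(m.group("id")),
                "lwp": int(m.group("lwp")),
                "current": m.group("current") is not None,
            })
    return threads


if __name__ == "__main__":
    sample = "\n".join([
        "  Id   Target Id         Frame",
        "* 1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in __kernel_vsyscall ()",
        "  2    Thread 0xb73b8b40 (LWP 11040) 0xb7711c31 in __kernel_vsyscall ()",
        "  3    Thread 0xb69ffb40 (LWP 11041) 0xb7711c31 in __kernel_vsyscall ()",
    ])
    for row in parse_info_threads(sample):
        print(row)
```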

Switching threads

Use thread ID:
(gdb) thread 3
[Switching to thread 3 (Thread 0xb69ffb40 (LWP 11041))]
#0  0xb7711c31 in __kernel_vsyscall ()
(gdb) info threads
  Id   Target Id         Frame
  1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in __kernel_vsyscall ()
  2    Thread 0xb73b8b40 (LWP 11040) 0xb7711c31 in __kernel_vsyscall ()
* 3    Thread 0xb69ffb40 (LWP 11041) 0xb7711c31 in __kernel_vsyscall ()
(gdb)

Thread 3 is now the selected thread.

The py- commands described earlier show further details about the selected thread:

[Current thread is 1 (Thread 0xb74b9700 (LWP 11039))]
(gdb) py-list
 335            waiter.acquire()
 336            self.__waiters.append(waiter)
 337            saved_state = self._release_save()
 338            try:    # restore state no matter what (e.g., KeyboardInterrupt)
 339                if timeout is None:
>340                    waiter.acquire()
 341                    if __debug__:
 342                        self._note("%s.wait(): got it", self)
 343                else:
 344                    # Balancing act:  We can't afford a pure busy loop, so we
 345                    # have to sleep; but if we sleep the whole timeout time,
(gdb) thread 2
[Switching to thread 2 (Thread 0xb73b8b40 (LWP 11040))]
#0  0xb7711c31 in __kernel_vsyscall ()
(gdb) py-list
   3    import time
   4
   5
   6    def do(x):
   7        x = x * 3
  >8        time.sleep(x * 60)
   9
  10
  11    def main():
  12        threads = []
  13        for x in range(1, 3):
(gdb)

Running a command in every thread

Use thread apply all CMD, or its abbreviation t a a CMD:
(gdb) thread apply all py-list

Thread 3 (Thread 0xb69ffb40 (LWP 11041)):
   3    import time
   4
   5
   6    def do(x):
   7        x = x * 3
  >8        time.sleep(x * 60)
   9
  10
  11    def main():
  12        threads = []
  13        for x in range(1, 3):

Thread 2 (Thread 0xb73b8b40 (LWP 11040)):
   3    import time
   4
   5
   6    def do(x):
   7        x = x * 3
  >8        time.sleep(x * 60)
   9
  10
  11    def main():
  12        threads = []
  13        for x in range(1, 3):

---Type <return> to continue, or q <return> to quit---
Thread 1 (Thread 0xb74b9700 (LWP 11039)):
 335            waiter.acquire()
 336            self.__waiters.append(waiter)
 337            saved_state = self._release_save()
 338            try:    # restore state no matter what (e.g., KeyboardInterrupt)
 339                if timeout is None:
>340                    waiter.acquire()
 341                    if __debug__:
 342                        self._note("%s.wait(): got it", self)
 343                else:
 344                    # Balancing act:  We can't afford a pure busy loop, so we
 345                    # have to sleep; but if we sleep the whole timeout time,
(gdb)
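Output from thread apply all gets long quickly, so it can be convenient to split it into one section per thread. The sketch below assumes the header format shown above (Thread N (Thread ADDRESS (LWP M)):) and drops gdb's pager prompt lines; split_by_thread is a hypothetical helper.

```python
import re

# Matches headers such as: Thread 3 (Thread 0xb69ffb40 (LWP 11041)):
HEADER_RE = re.compile(r'^Thread (?P<id>\d+) \(Thread 0x[0-9a-f]+ \(LWP \d+\)\):$')


def split_by_thread(text):
    """Return {gdb thread id: that thread's section of the output}."""
    sections = {}
    current = None
    for line in text.splitlines():
        m = HEADER_RE.match(line.strip())
        if m:
            current = int(m.group("id"))
            sections[current] = []
        elif current is not None and not line.startswith("---Type <return>"):
            sections[current].append(line)
    # Trim blank lines around each section but keep the listing's indentation.
    return {tid: "\n".join(lines).strip("\n") for tid, lines in sections.items()}


if __name__ == "__main__":
    sample = "\n".join([
        "Thread 2 (Thread 0xb73b8b40 (LWP 11040)):",
        "  >8        time.sleep(x * 60)",
        "",
        "Thread 1 (Thread 0xb74b9700 (LWP 11039)):",
        ">340                    waiter.acquire()",
    ])
    for tid, body in sorted(split_by_thread(sample).items()):
        print(tid, repr(body))
```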

These are the most commonly used Python-related gdb operations. Keep in mind that all the regular gdb commands remain available as well.
