Commonly used Python debugging tools

巴扎黑
2017-04-30 15:52:28

The following is an overview of the tools I use for debugging and profiling. If you know of a better tool, please mention it in the comments; there is no need to give a full introduction.

Logging

That's right, logging. The importance of adequate logging in your application cannot be overstated. You should log the important stuff. If your logs are good enough, you can find the problem just by reading them, which saves you a lot of time.

If you've been littering your code with print statements, stop now. Use logging.debug instead. You can keep those calls around for later, disable them all at once, and so on.
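As a rough illustration of the point about logging.debug, here is a minimal sketch for Python 3. The logger name myapp.orders and the total function are made up for the example, and the in-memory buffer is only there to keep it self-contained.

```python
import io
import logging

# Hypothetical logger name; a module would typically use __name__.
log = logging.getLogger("myapp.orders")

# Send records to an in-memory buffer so the example is self-contained;
# a real application would more likely log to stderr or a file.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
log.addHandler(handler)
log.propagate = False


def total(prices):
    log.debug("computing total of %d prices", len(prices))
    return sum(prices)


log.setLevel(logging.DEBUG)    # debug output switched on
total([1, 2, 3])

log.setLevel(logging.WARNING)  # debug output switched off, no code edits needed
total([4, 5])

print(buffer.getvalue())
```

Only the first call leaves a record in the buffer. The second debug call stays in the code but costs almost nothing once the level is raised.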

Tracing

Sometimes a better approach is to see which statements actually get executed. You could step through the code with an IDE's debugger, but you need to know what you are looking for, otherwise the process will be very slow.
The trace module in the standard library can print every statement executed at runtime, among other things (such as producing coverage reports).

python -m trace --trace script.py

This produces a lot of output (every executed line is printed), so you may want to grep for the modules that interest you.
For example:

python -m trace --trace script.py | egrep '^(mod1.py|mod2.py)'
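The trace module can also be driven from code rather than from the command line. The sketch below, for Python 3, counts how often each line runs instead of printing every line. The collatz_steps function is a toy invented for the example.

```python
import trace


def collatz_steps(n):
    """Toy function to trace: count Collatz steps until n reaches 1."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


# count=True records how often each line ran; trace=False suppresses
# the line-by-line printout that --trace produces on the command line.
tracer = trace.Trace(count=True, trace=False)
result = tracer.runfunc(collatz_steps, 6)

# counts maps (filename, line number) -> number of executions.
counts = tracer.results().counts
for (filename, lineno), hits in sorted(counts.items()):
    print(f"{filename}:{lineno} ran {hits} time(s)")
```

Passing trace=True instead gives the same kind of line-by-line output as the command shown above, limited to the one call.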

Debugger

The following are the basics that everyone should know by now:

import pdb
pdb.set_trace() # opens the pdb prompt

Or

try:
    ...  # code that raises an exception
except:
    import pdb
    pdb.post_mortem() # in the interactive interpreter, pdb.pm() does the same after an uncaught exception

Or (type c to start executing the script)

python -m pdb script.py

At the pdb prompt (a REPL, short for read-eval-print loop), the following commands are available:

  • c or continue
  • q or quit
  • l or list, show the source code around the current line of the current frame
  • w or where, print the stack trace
  • d or down, move one frame down the stack (to a newer frame)
  • u or up, move one frame up the stack (to an older frame)
  • (Enter), repeat the previous command

Almost anything else you type (apart from a few other commands) is evaluated as Python code in the current frame.
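To see these commands without typing them, pdb can be fed a script. This is a small sketch for Python 3, not how you would normally use the debugger. The add function and the command string are invented for the example.

```python
import io
import pdb


def add(a, b):
    total = a + b
    return total


# Commands are fed from a string instead of the keyboard:
# w (where), p (print an expression), c (continue).
commands = io.StringIO("w\np a + b\nc\n")
transcript = io.StringIO()

# Passing stdin/stdout makes pdb read from and write to these objects;
# readrc=False skips any personal .pdbrc file.
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False)
result = debugger.runcall(add, 2, 3)  # stops at the first line of add()

print(transcript.getvalue())
```

The transcript holds what an interactive session would have shown: the stack at the stop, the value of a + b evaluated in the current frame, and then the function runs to completion.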

If that's not enough for you, try smiley: it shows you the variables, and you can use it to trace a program remotely.

A better debugger

Drop-in replacements for pdb:
ipdb (pip install ipdb) – like IPython (tab completion, colors, etc.)
pudb (pip install pudb) – curses-based (a GUI-like interface), especially good for browsing source code

Remote debugger

Installation method:

sudo apt-get install winpdb

Use the following method to replace the previous pdb.set_trace():

import rpdb2
rpdb2.start_embedded_debugger("secretpassword")

Now run winpdb and go to File > Attach.

Don't like Winpdb? You can also wrap pdb directly to run over TCP!

Do this:

import logging
import pdb
import socket
import sys

class Rdb(pdb.Pdb):
    """
    This will run pdb as an ephemeral telnet service. Once you connect no one
    else can connect. On construction this object will block execution till a
    client has connected.

    Based on https://github.com/tamentis/rpdb I think ...

    To use this::

        Rdb(4444).set_trace()

    Then run: telnet 127.0.0.1 4444
    """
    def __init__(self, port=0):
        self.old_stdout = sys.stdout
        self.old_stdin = sys.stdin
        self.listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.listen_socket.bind(('0.0.0.0', port))
        if not port:
            logging.critical("PDB remote session open on: %s", self.listen_socket.getsockname())
            sys.__stderr__.write("PDB remote session open on: %s\n" % (self.listen_socket.getsockname(),))
            sys.stderr.flush()
        self.listen_socket.listen(1)
        self.connected_socket, address = self.listen_socket.accept()
        self.handle = self.connected_socket.makefile('rw')
        pdb.Pdb.__init__(self, completekey='tab', stdin=self.handle, stdout=self.handle)
        sys.stdout = sys.stdin = self.handle

    def do_continue(self, arg):
        sys.stdout = self.old_stdout
        sys.stdin = self.old_stdin
        self.handle.close()
        self.connected_socket.close()
        self.listen_socket.close()
        self.set_continue()
        return 1

    do_c = do_cont = do_continue

def set_trace():
    """
    Opens a remote PDB on first available port.
    """
    rdb = Rdb()
    rdb.set_trace()

Just want a REPL environment? How about trying IPython?

If you don't need a full debugger, just start an IPython shell like this:

import IPython
IPython.embed()

Standard Linux tools

I'm often surprised at how underused they are. You can use these tools to diagnose a wide range of problems: from performance problems (too many system calls, memory allocations, etc.) to deadlocks, network problems, disk problems, and more.
The most useful and most straightforward one is strace. Just run sudo strace -p 12345 or strace -f command (-f also traces forked child processes), and that's it. The output is typically quite large, so you may want to redirect it to a file for further analysis (just append &> filename).

Then there is ltrace, which is similar to strace except that it shows library function calls. The options are roughly the same.

There is also lsof, for working out what the file descriptor numbers you see in ltrace/strace output refer to. For example:

lsof -p 12345

Better tracing

It's easy to use and can do a lot: everyone should install htop!

sudo apt-get install htop
sudo htop

Now find the process you want and press:

s - system call trace (like strace)
L - library call trace (like ltrace)
l - lsof

Monitoring

There's no replacement for good, continuous server monitoring. But if you ever find yourself wondering why everything is running so slowly, or what is eating all those system resources, and you don't know where to start, you don't need to reach for iotop, iftop, htop, iostat, vmstat and the like. Just use dstat! It can do most of what those tools do, and maybe even better!
It displays data continuously in a compact, color-coded way (unlike iostat and vmstat), and past data remains visible (unlike iftop, iotop, and htop).

Just run:

dstat --cpu --io --mem --net --load --fs --vm --disk-util --disk-tps --freespace --swap --top-io --top-bio-adv

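There is probably a shorter way to write the command above.

GDB

This is a rather complex and powerful tool, so I only cover the basics here (installation and basic commands).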

sudo apt-get install gdb python-dbg
zcat /usr/share/doc/python2.7/gdbinit.gz > ~/.gdbinit

Run your application with python2.7-dbg:

sudo gdb -p 12345

Now use:

bt - stack trace (C level)
pystack - Python stack trace; unfortunately you need ~/.gdbinit in place and must be using python-dbg
c - continue

Segfaults? Use faulthandler!

A great feature added in Python 3.3, also backported to Python 2.x. Just run the following and you will at least get a rough idea of what caused the segfault.

import faulthandler
faulthandler.enable()
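Two details that may be useful in practice are sending the report to a file and dumping the stack on demand. Here is a minimal sketch for Python 3. The temporary file and the where_am_i function are only for illustration; a real service would more likely use a persistent log file.

```python
import faulthandler
import tempfile

# faulthandler writes through a raw file descriptor, so it needs a real
# file (something with fileno()), not an in-memory buffer.
crash_log = tempfile.TemporaryFile(mode="w+")

# Keep crash_log open for as long as the handler is enabled.
faulthandler.enable(file=crash_log, all_threads=True)
enabled = faulthandler.is_enabled()


# The same machinery can dump the current stack on demand, no crash needed.
def where_am_i():
    faulthandler.dump_traceback(file=crash_log)


where_am_i()
crash_log.seek(0)
dump = crash_log.read()
print(dump)
```

The dump lists each frame as a file, line number, and function name, which is the same shape of report a real segfault would leave behind.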

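Memory leaks

Well, there are plenty of tools for this, some aimed specifically at WSGI applications, such as Dozer, but my favorite is definitely objgraph. It's amazingly convenient and easy to use!

It has no integration with WSGI or anything else, so you need to find your own way to run code like this: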

import objgraph
objs = objgraph.by_type("Request")[:15]
objgraph.show_backrefs(objs, max_depth=20, highlight=lambda v: v in objs,
                       filename="/tmp/graph.png")
Graph written to /tmp/objgraph-zbdM4z.dot (107 nodes)
Image generated as /tmp/graph.png

You will get a graph like this (warning: it is very large). You can also get dot output.
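If objgraph is not installed, the standard library's gc module gives a much cruder view of the same information. The sketch below, for Python 3, is not objgraph's API. It only counts live objects by type name, and the Request class is a hypothetical stand-in for whatever type you suspect of leaking.

```python
import gc
from collections import Counter


class Request:
    """Hypothetical stand-in for the leaking type."""


def count_instances(type_name):
    """Count live objects tracked by the GC whose type has this name."""
    return sum(1 for obj in gc.get_objects() if type(obj).__name__ == type_name)


def most_common_types(limit=5):
    """Return the `limit` most frequent type names among tracked objects."""
    names = Counter(type(obj).__name__ for obj in gc.get_objects())
    return names.most_common(limit)


before = count_instances("Request")
leaked = [Request() for _ in range(100)]  # simulated leak: references kept alive
after = count_instances("Request")

print("new Request objects:", after - before)
print(most_common_types())
```

Comparing counts before and after a suspicious operation tells you which type is growing. Finding out what keeps those objects alive is where a back-reference graph like the one above earns its keep.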

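Memory usage

Sometimes you want to use less memory. Fewer allocations usually make an application faster, and users like their programs lean.
There are many tools available, but in my opinion the best is pytracemalloc. Compared with other tools it has very little overhead (it doesn't rely on sys.settrace, which cripples speed) and its output is very detailed. It's a pain to install because you need to recompile Python, but apt makes that quite easy.

Just run these commands and go grab lunch or something: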

apt-get source python2.7
cd python2.7-*
wget https://github.com/wyplay/pytracemalloc/raw/master/python2.7_track_free_list.patch
patch -p1 < python2.7_track_free_list.patch
debuild -us -uc
cd ..
sudo dpkg -i python2.7-minimal_2.7*.deb python2.7-dev_*.deb

Then install pytracemalloc (note that if you are working in a virtualenv, you need to recreate it after reinstalling Python; just run virtualenv myenv)

pip install pytracemalloc

Now wrap your application in code like this:

import tracemalloc, time
tracemalloc.enable()
top = tracemalloc.DisplayTop(
    5000, # log the top 5000 locations
    file=open('/tmp/memory-profile-%s' % time.time(), "w")
)
top.show_lineno = True
try:
    pass # code that needs to be traced
finally:
    top.display()

The output looks like this:

2013-05-31 18:05:07: Top 5000 allocations per file and line
 #1: .../site-packages/billiard/_connection.py:198: size=1288 KiB, count=70 (+0), average=18 KiB
 #2: .../site-packages/billiard/_connection.py:199: size=1288 KiB, count=70 (+0), average=18 KiB
 #3: .../python2.7/importlib/__init__.py:37: size=459 KiB, count=5958 (+0), average=78 B
 #4: .../site-packages/amqp/transport.py:232: size=217 KiB, count=6960 (+0), average=32 B
 #5: .../site-packages/amqp/transport.py:231: size=206 KiB, count=8798 (+0), average=24 B
 #6: .../site-packages/amqp/serialization.py:210: size=199 KiB, count=822 (+0), average=248 B
 #7: .../lib/python2.7/socket.py:224: size=179 KiB, count=5947 (+0), average=30 B
 #8: .../celery/utils/term.py:89: size=172 KiB, count=1953 (+0), average=90 B
 #9: .../site-packages/kombu/connection.py:281: size=153 KiB, count=2400 (+0), average=65 B
 #10: .../site-packages/amqp/serialization.py:462: size=147 KiB, count=4704 (+0), average=32 B
  …

Beautiful, isn't it?
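On Python 3.4 and later, tracemalloc ships in the standard library, so no patched interpreter is needed. Its API differs from the pytracemalloc snippet above: it uses start() and snapshots rather than enable() and DisplayTop. Here is a minimal sketch, with a made-up allocation standing in for the code you want to trace.

```python
import tracemalloc

tracemalloc.start()

# Code to be traced: hold on to some allocations so they show up.
retained = [str(i) * 10 for i in range(10_000)]

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by source line, largest first.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:5]:
    print(stat)
```

Each printed entry gives a file and line number with the total size and number of allocations made there, much like the report shown above.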

Addendum: more on debugging can be found here.

Original article: Ionel Cristian Mărieș   Translation: 伯乐在线 - 高磊
