How to Get the Source Code of a Python Object

 2 years ago
source link: https://www.lujun9972.win/blog/2018/05/18/%E5%A6%82%E4%BD%95%E8%8E%B7%E5%8F%96python%E5%AF%B9%E8%B1%A1%E7%9A%84%E6%BA%90%E4%BB%A3%E7%A0%81/index.html

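Sometimes we want to know how a Python function, class, or module is defined (we will collectively call these Python objects), or find out which file it is defined in.

Two libraries can help with this: the built-in inspect library and the third-party dill library.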

The inspect library

help(inspect) shows the following description of inspect:

DESCRIPTION
    This module encapsulates the interface provided by the internal special
    attributes (co_*, im_*, tb_*, etc.) in a friendlier fashion.
    It also provides some help for examining source code and class layout.
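
To make that description concrete, the sketch below compares the raw code-object attributes with the inspect helpers that wrap them. textwrap.dedent is only an arbitrary standard-library function picked for illustration; other pure-Python functions should behave similarly.

```python
import inspect
import textwrap

# Raw route: read the special attributes of the code object directly.
code = textwrap.dedent.__code__
raw_file = code.co_filename            # file the function was compiled from
raw_first_line = code.co_firstlineno   # line where the definition starts

# Friendlier route: let inspect look up the same information.
source_file = inspect.getsourcefile(textwrap.dedent)
lines, first_line = inspect.getsourcelines(textwrap.dedent)

print(raw_file, raw_first_line)
print(source_file, first_line)
```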

We mainly use the module's getsource* family of functions, namely:

  1. inspect.getsource(object) returns the source code of a Python object as a single string

    import inspect
    print(inspect.getsource(inspect.getsource))
    
    def getsource(object):
        """Return the text of the source code for an object.
    
        The argument may be a module, class, method, function, traceback, frame,
        or code object.  The source code is returned as a single string.  An
        OSError is raised if the source code cannot be retrieved."""
        lines, lnum = getsourcelines(object)
        return ''.join(lines)
    
  2. inspect.getsourcefile(object) returns the file in which a Python object is defined

    import inspect
    print(inspect.getsourcefile(inspect.getsourcefile))
    
    /usr/lib/python3.6/inspect.py
    
  3. inspect.getsourcelines(object) returns a tuple containing the list of the object's source lines and the line number at which the definition starts in the source file

    import inspect
    print(inspect.getsourcelines(inspect.getsourcelines))
    
    (['def getsourcelines(object):\n', '    """Return a list of source lines and starting line number for an object.\n', '\n', '    The argument may be a module, class, method, function, traceback, frame,\n', '    or code object.  The source code is returned as a list of the lines\n', '    corresponding to the object and the line number indicates where in the\n', '    original source file the first line of code was found.  An OSError is\n', '    raised if the source code cannot be retrieved."""\n', '    object = unwrap(object)\n', '    lines, lnum = findsource(object)\n', '\n', '    if ismodule(object):\n', '        return lines, 0\n', '    else:\n', '        return getblock(lines[lnum:]), lnum + 1\n'], 946)
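
The three functions above fit together as sketched below. json.dumps is an arbitrary target chosen for illustration, and the sketch assumes the standard library's .py files are installed alongside the interpreter.

```python
import inspect
import json

target = json.dumps  # arbitrary pure-Python function from the standard library

source = inspect.getsource(target)             # whole definition as one string
path = inspect.getsourcefile(target)           # file that contains the definition
lines, start = inspect.getsourcelines(target)  # list of lines plus starting line

# getsource() returns the lines from getsourcelines() joined together.
assert source == "".join(lines)

print(f"{target.__name__} is defined in {path}, starting at line {start}")
print(lines[0], end="")
```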
    

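However, inspect has a limitation: in the Python 3.6 setup used here, it cannot retrieve the source code of objects defined in an interactive Python session. The <stdin> and main frames in the traceback indicate that the snippet was fed to the interpreter on standard input inside a wrapper function:

import inspect
def sum(a, b):
    return a + b
print(inspect.getsource(sum))
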
Traceback (most recent call last):
  File "<stdin>", line 8, in <module>
  File "<stdin>", line 6, in main
  File "/usr/lib/python3.6/inspect.py", line 968, in getsource
    lines, lnum = getsourcelines(object)
  File "/usr/lib/python3.6/inspect.py", line 955, in getsourcelines
    lines, lnum = findsource(object)
  File "/usr/lib/python3.6/inspect.py", line 786, in findsource
    raise OSError('could not get source code')
OSError: could not get source code
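
The same failure can be reproduced from an ordinary script, which also shows one way to handle it. In this sketch the function is compiled from a string under the made-up placeholder filename "<demo>" so that no file backs it, and safe_getsource is a hypothetical helper name, not part of inspect.

```python
import inspect

# Simulate an "interactive" definition: compile a function from a string.
# "<demo>" is a placeholder filename, so no file on disk backs the code.
namespace = {"__name__": "demo_namespace"}
exec(compile("def add(a, b):\n    return a + b\n", "<demo>", "exec"), namespace)


def safe_getsource(obj):
    """Return the source of obj, or None when inspect cannot find it."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # OSError: no source text could be located (as in the traceback above).
        # TypeError: the object has no Python source at all, e.g. a builtin.
        return None


print(safe_getsource(namespace["add"]))               # function with no backing file
print(safe_getsource(len))                            # builtin implemented in C
print(safe_getsource(inspect.getsource) is not None)  # ordinary library function
```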

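To retrieve the source code of objects defined in an interactive Python session, you can use the third-party dill library.

The dill library

dill has a source submodule that provides an API very similar to that of inspect:

  1. The dill.source.getsource function

    import dill
    print(dill.source.getsource(dill.source.getsource))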
    
    def getsource(object, alias='', lstrip=False, enclosing=False, \
                                                  force=False, builtin=False):
        """Return the text of the source code for an object. The source code for
        interactively-defined objects are extracted from the interpreter's history.
    
        The argument may be a module, class, method, function, traceback, frame,
        or code object.  The source code is returned as a single string.  An
        IOError is raised if the source code cannot be retrieved, while a
        TypeError is raised for objects where the source code is unavailable
        (e.g. builtins).
    
        If alias is provided, then add a line of code that renames the object.
        If lstrip=True, ensure there is no indentation in the first line of code.
        If enclosing=True, then also return any enclosing code.
        If force=True, catch (TypeError,IOError) and try to use import hooks.
        If builtin=True, force an import for any builtins
        """
        # hascode denotes a callable
        hascode = _hascode(object)
        # is a class instance type (and not in builtins)
        instance = _isinstance(object)
    
        # get source lines; if fail, try to 'force' an import
        try: # fails for builtins, and other assorted object types
            lines, lnum = getsourcelines(object, enclosing=enclosing)
        except (TypeError, IOError): # failed to get source, resort to import hooks
            if not force: # don't try to get types that findsource can't get
                raise
            if not getmodule(object): # get things like 'None' and '1'
                if not instance: return getimport(object, alias, builtin=builtin)
                # special handling (numpy arrays, ...)
                _import = getimport(object, builtin=builtin)
                name = getname(object, force=True)
                _alias = "%s = " % alias if alias else ""
                if alias == name: _alias = ""
                return _import+_alias+"%s\n" % name
            else: #FIXME: could use a good bit of cleanup, since using getimport...
                if not instance: return getimport(object, alias, builtin=builtin)
                # now we are dealing with an instance...
                name = object.__class__.__name__
                module = object.__module__
                if module in ['builtins','__builtin__']:
                    return getimport(object, alias, builtin=builtin)
                else: #FIXME: leverage getimport? use 'from module import name'?
                    lines, lnum = ["%s = __import__('%s', fromlist=['%s']).%s\n" % (name,module,name,name)], 0
                    obj = eval(lines[0].lstrip(name + ' = '))
                    lines, lnum = getsourcelines(obj, enclosing=enclosing)
    
        # strip leading indent (helps ensure can be imported)
        if lstrip or alias:
            lines = _outdent(lines)
    
        # instantiate, if there's a nice repr  #XXX: BAD IDEA???
        if instance: #and force: #XXX: move into findsource or getsourcelines ?
            if '(' in repr(object): lines.append('%r\n' % object)
           #else: #XXX: better to somehow to leverage __reduce__ ?
           #    reconstructor,args = object.__reduce__()
           #    _ = reconstructor(*args)
            else: # fall back to serialization #XXX: bad idea?
                #XXX: better not duplicate work? #XXX: better new/enclose=True?
                lines = dumpsource(object, alias='', new=force, enclose=False)
                lines, lnum = [line+'\n' for line in lines.split('\n')][:-1], 0
           #else: object.__code__ # raise AttributeError
    
        # add an alias to the source code
        if alias:
            if hascode:
                skip = 0
                for line in lines: # skip lines that are decorators
                    if not line.startswith('@'): break
                    skip += 1
                #XXX: use regex from findsource / getsourcelines ?
                if lines[skip].lstrip().startswith('def '): # we have a function
                    if alias != object.__name__:
                        lines.append('\n%s = %s\n' % (alias, object.__name__))
                elif 'lambda ' in lines[skip]: # we have a lambda
                    if alias != lines[skip].split('=')[0].strip():
                        lines[skip] = '%s = %s' % (alias, lines[skip])
                else: # ...try to use the object's name
                    if alias != object.__name__:
                        lines.append('\n%s = %s\n' % (alias, object.__name__))
            else: # class or class instance
                if instance:
                    if alias != lines[-1].split('=')[0].strip():
                        lines[-1] = ('%s = ' % alias) + lines[-1]
                else:
                    name = getname(object, force=True) or object.__name__
                    if alias != name:
                        lines.append('\n%s = %s\n' % (alias, name))
        return ''.join(lines)
    
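  2. The dill.source.getsourcefile function (the result points into inspect.py, which suggests that dill.source reuses inspect's getsourcefile rather than defining its own)

    import dill
    print(dill.source.getsourcefile(dill.source.getsourcefile))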
    
    /usr/lib/python3.6/inspect.py
    
  3. The dill.source.getsourcelines function

    import dill
    print(dill.source.getsourcelines(dill.source.getsourcelines))
    
    (['def getsourcelines(object, lstrip=False, enclosing=False):\n', '    """Return a list of source lines and starting line number for an object.\n', "    Interactively-defined objects refer to lines in the interpreter's history.\n", '\n', '    The argument may be a module, class, method, function, traceback, frame,\n', '    or code object.  The source code is returned as a list of the lines\n', '    corresponding to the object and the line number indicates where in the\n', '    original source file the first line of code was found.  An IOError is\n', '    raised if the source code cannot be retrieved, while a TypeError is\n', '    raised for objects where the source code is unavailable (e.g. builtins).\n', '\n', '    If lstrip=True, ensure there is no indentation in the first line of code.\n', '    If enclosing=True, then also return any enclosing code."""\n', '    code, n = getblocks(object, lstrip=lstrip, enclosing=enclosing, locate=True)\n', '    return code[-1], n[-1]\n'], 300)
    

Unlike inspect, however, it can also retrieve the source code of objects defined in an interactive Python session (screenshot-01.png in the original post).
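
One way to combine the two libraries is to prefer dill when it is installed and fall back to inspect otherwise. The helper name get_source is hypothetical, and this is only a sketch under that assumption; according to the dill docstring quoted earlier, recovering interactively defined code relies on the interpreter's history, so results in a real session may vary.

```python
import inspect

try:
    import dill.source as dill_source  # third-party and optional
except ImportError:
    dill_source = None


def get_source(obj):
    """Return the source of obj via dill if available, else via inspect."""
    getter = dill_source.getsource if dill_source is not None else inspect.getsource
    try:
        return getter(obj)
    except (OSError, TypeError):
        # Both libraries signal "no source available" with these exceptions
        # (IOError, named in the dill docstring, is an alias of OSError).
        return None


print(get_source(inspect.getsourcefile))
```

Typing a function at the interactive prompt and passing it to get_source with dill installed is presumably the case the original screenshot illustrated.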

