26

A Failed Experiment with Python Type Annotations

 4 years ago
source link: https://www.tuicool.com/articles/zERzEfU
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

I like Python, but wish it had static typing. The added safety would go a long way to improving quality and reducing development time. So today I tried to make use of type annotations and a static type-checker called mypy .

After a few basic tests, I was excited. But my glee turned to disappointment rather quickly. There are two fundamental issues that make it an unusable solution.

  • You can’t have self-referential classes in the type annotations, thus no containers
  • You can’t have inferred return type values, thus requiring extensive wasteful annotations

Let’s look at both problems.

Self-Referential

I’m trying to render some articles for Interview.Codes with my MDL processor . The parser uses a Node class to create a parse tree. This class contains Node children as part of a tree structure. Logically, that means I’d have functions like below.

class Node(Object):

    def add_sub( self, sub : Node ):
        ...

    def get_subs( self ) -> Sequence[Node]:
        ...

mypy has no trouble understanding this, but it’s unfortunately not valid Python code. You can’t refer to Node within the Node class.

The workaround suggested is using a TypeVar .

NodeT = TypeVar( `NodeT`, bound=`Node` )
class Node(Object):
    def add_sub( self, sub : NodeT ):
        ...

    def get_subs( self ) -> Sequence[NodeT]:
        ...

This is ugly. I’m reminded of C++’s _t pattern. Part of my attraction to Python is the simplified syntax. Having to decorate classes like this makes it far less appealing. Plus, it’s boiler-plate code adding overhead for later understanding.

The limitation in Python comes from Node not yet being in the symbol table. It doesn’t make it into the symbol table until after the class is processed, meaning you can’t use Node within the class. This is a limitation of the compiler. There’s no reason this needs to be this way, except perhaps for backwards compatibility with screwy old code.

Perhaps we can’t use the class name. But we could have a Self or Class symbol that refers to the enclosing class.

No Inferred Return Types

One of the great values of Python is not having to put types everywhere. You can write functions like below.

def get_value():
    return 123

Now, if you’re using TypeScript or C++ the compiler can happily infer the return type of functions. For unknown reasons, mypy choses not to infer the return types of functions. Instead, if there is no type annotation it assumes it returns type Any .

This means I must annotate all functions with information the static type checker already knows. It’s redundant and messy.

You’re additionally forced to learn the names and structure of all types. Ones you could otherwise safely ignore.

def get_iter():
    return iter(sequence)

def get_closure(self):
    return lamba q : self.op(q)

Why should I have to know the type that iter returns to write this function? Or do you have any idea what type get_closure returns? I know how to use the return, and can even reason it’s a function, but I’d have no idea how to specify its type. Knowing the myriad of types isn’t feasible. You’ll end up spending more time trying to tweak types than using the code.

This complexity helped drive the introduction of the auto keyword to C++. There are many situations where writing the type information isn’t workable. This is especially true when dealing with parametric container classes,

Inferring return types is an essential feature.

Avoiding it for now

These two problems repeat throughout my codebase. I’m okay when there’s a limitation that occasionally affects the code, but this is fundamental. To use type checking, I’d have to add the redundant class declarations to every container-like class. To use type checking at all, I’d have to annotate the return value of all functions.

Static type checking should not be a tradeoff and there’s no fundamental reason these limitations can’t be lifted. When these are fixed, I’ll happily come back and use type annotations.

Image Credit: Mari Carmen


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK