Question

Does Python evaluate type hinting of a forward reference?

I was looking at the PEP 484 section on Forward References and noticed the statement:

...that definition may be expressed as a string literal, to be resolved later.

And that got me wondering, when is "later" and by what? The interpreter doesn't try to resolve it as a literal later, so what does? Is it just if a third party tool is written to do that?

Small example to demonstrate the interpreter result:

class A:
    def test(self, a: 'A') -> None:
        pass
class B:
    def test(self, a: A) -> None:
        pass

>>> A().test.__annotations__
{'a': 'A', 'return': None}
>>> B().test.__annotations__
{'a': <class '__main__.A'>, 'return': None}

If my understanding of function annotations and type hints is correct, Python doesn't really do anything with them at runtime to improve performance, but rather the introspective use allows strictly third party applications such as linters, IDEs and static analysis tools (such as mypy) to take advantage of their availability. So would those tools try to resolve the type hint of 'A' rather than having that be a job given to the interpreter and if so, how do they accomplish this?

By using the typing module, user code can perform the following:

>>> import typing
>>> typing.get_type_hints(A().test)
{'a': <class '__main__.A'>, 'return': <class 'NoneType'>}
>>> typing.get_type_hints(B().test)
{'a': <class '__main__.A'>, 'return': <class 'NoneType'>}

However, my question is aimed at whether or not Python has any responsibility in updating the __annotations__ of a function from a string literal, that is to say at runtime change:

>>> A().test.__annotations__
{'a': 'A', 'return': None}

to...

>>> A().test.__annotations__
{'a': <class '__main__.A'>, 'return': None}

If Python doesn't do it, then why would I want a string literal as a type hint other than for self-documented code? What value does the first form give to me, a user or a third party tool?


Solution


Consider the following code:

class Foo:
    def bar(self) -> Foo:
        return Foo()

This program raises a NameError (name 'Foo' is not defined) when you run it on a Python version that evaluates annotations eagerly (3.13 and earlier): when the interpreter reaches the definition of bar, the body of Foo is still being executed. Since the name Foo has not yet been bound in the global namespace, we can't use it as a type hint yet.

Similarly, consider this program:

class Foo:
    def bar(self) -> Bar:
        return Bar()

class Bar:
    def foo(self) -> Foo:
        return Foo()

This mutually dependent definition suffers from the same problem: while we're evaluating Foo, Bar hasn't been defined yet, so the interpreter raises a NameError.


There are three solutions to this problem. The first is to make some of your type hints strings, effectively "forward declaring" them:

class Foo:
    def bar(self) -> "Foo":
        return Foo()

This satisfies the Python interpreter, and won't disrupt third party tools like mypy: they can just remove the quotes before parsing the type. The main disadvantage is that this syntax looks sort of ugly and clunky.
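
To make the "resolved later" part concrete, here is a minimal sketch of how code running in the same process could turn the string hints back into classes with typing.get_type_hints from the standard library. It assumes the classes live at module level, so the names can be found in the function's globals; the variable names raw and resolved are only for illustration.

```python
import typing


class Foo:
    def bar(self) -> "Bar":  # Bar does not exist yet, so the hint is quoted
        return Bar()


class Bar:
    def foo(self) -> "Foo":
        return Foo()


# The interpreter stores the quoted hint untouched, as a plain string.
raw = Foo.bar.__annotations__

# get_type_hints evaluates the string in the function's global namespace,
# which by this point contains both classes.
resolved = typing.get_type_hints(Foo.bar)

print(raw)
print(resolved)
```

Static checkers such as mypy never execute the module; they parse the string themselves. The runtime lookup above is what introspection-based libraries would typically rely on.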

The second solution is to use type comments syntax:

class Foo:
    def bar(self):
        # type: () -> Foo
        return Foo()

This has the same benefits and disadvantages as the first solution: it satisfies the interpreter and tooling, but looks hacky and ugly. It has the additional benefit of keeping your code backwards-compatible with Python 2.7.
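
One difference from the first solution is worth seeing in code: a type comment is an ordinary comment, so nothing is recorded at runtime and only tools that read the source can use it. A small sketch (the names runtime_annotations and runtime_hints are just for illustration):

```python
import typing


class Foo:
    def bar(self):
        # type: () -> Foo
        return Foo()


# The comment is discarded by the interpreter, so both lookups come back empty.
runtime_annotations = Foo.bar.__annotations__
runtime_hints = typing.get_type_hints(Foo.bar)

print(runtime_annotations, runtime_hints)
```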

The third solution is Python 3.7+ only -- use the from __future__ import annotations directive:

from __future__ import annotations 

class Foo:
    def bar(self) -> Foo:
        return Foo()

This makes the interpreter store every annotation in the module as a string instead of evaluating it. So we get the benefit of the first solution, but without the ugliness.
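
A rough way to check this for yourself is sketched below. The __future__ import has to be the first statement of a module, so this sketch runs a small module source through exec purely so that the snippet can be pasted anywhere; in real code you would simply put the import at the top of the file. The names source, namespace, raw and resolved are illustrative.

```python
import typing

# Stand-in for the contents of a separate module file.
source = '''
from __future__ import annotations


class Foo:
    def bar(self) -> Foo:
        return Foo()
'''

namespace = {"__name__": "demo"}
exec(source, namespace)
Foo = namespace["Foo"]

# The unquoted annotation was stored as a string rather than evaluated.
raw = Foo.bar.__annotations__

# It can still be resolved on demand, just like a hand-quoted hint.
resolved = typing.get_type_hints(Foo.bar)

print(raw)
print(resolved)
```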

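When this answer was written, this behavior was planned to become the default in a future version of Python. That plan was later dropped: Python 3.14 instead defers the evaluation of annotations by default (PEP 649), so forward references like the one above work there without quotes or the __future__ import.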

It also turns out that automatically making all annotations strings can come with some performance improvements. Constructing types like List[Dict[str, int]] can be surprisingly expensive: they're just ordinary expressions at runtime, evaluated as if they were written as List.__getitem__(Dict.__getitem__((str, int))).

Evaluating this expression is somewhat expensive: we end up performing two method calls, constructing a tuple, and constructing two objects. This isn't counting any additional work that happens in the __getitem__ methods themselves, of course -- and the work that happens in those methods ends up being non-trivial out of necessity.

(In short, they need to construct special objects that ensure types like List[int] can't be used in inappropriate ways at runtime -- e.g. in isinstance checks and the like.)
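
Both points can be checked with a short sketch using the typing aliases List and Dict. It only demonstrates the equivalence and the isinstance restriction; it does not measure the cost. The names via_subscript, via_calls, same and rejected are illustrative.

```python
from typing import Dict, List

# Subscript syntax is sugar for nested __getitem__ calls on the typing aliases.
via_subscript = List[Dict[str, int]]
via_calls = List.__getitem__(Dict.__getitem__((str, int)))
same = via_subscript == via_calls

# The resulting special object refuses to take part in isinstance checks.
try:
    isinstance([], List[int])
    rejected = False
except TypeError:
    rejected = True

print(same, rejected)
```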

2019-03-25