Hands on with Python 3.7: what’s new in the latest release

Python is used for general-purpose programming, data science, website backends, GUIs, and pretty much everything else. It is the first programming language for many, and is claimed to be the fastest growing in the world. The newest version, 3.7.0, has just been released.
Naturally, any release of Python, no matter how small, undergoes meticulous planning and design before any development starts. In fact, you can read the PEP (Python Enhancement Proposal) for Python 3.7, which was created back in 2016.

What’s new in 3.7? Why should you upgrade? Is there anything new that’s actually useful? I’ll answer these questions for you by walking through some examples of the new features. Whilst there’s not much in this release that will make a difference to the Python beginner, there are plenty of small changes for seasoned coders and a few headline features you’ll want to know about.

BREAKPOINTS ARE NOW BUILTINS

Anyone who has used the pdb (Python debugger) knows how powerful it is. It gives you the ability to pause the execution of your script, allowing you to manually roam around the internals of the program and step over individual lines.

But, up until now, it required some setup when writing a program. Sure, it takes practically no time at all for you to import pdb and set_trace(), but it’s not on the same level of convenience as chucking in a quick debug print() or log. As of Python 3.7, breakpoint() is a built-in, making it super easy to drop into a debugger anytime you like. It’s also worth noting that the pdb is just one of many debuggers available, and you can configure which one you’d like to use by setting the new PYTHONBREAKPOINT environment variable.
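
As a rough sketch of how the debugger can be swapped out: breakpoint() simply calls sys.breakpointhook(), so replacing that hook changes what a breakpoint does. The names logging_hook, calls and double below are made up for illustration; a real project would more likely point PYTHONBREAKPOINT at an actual debugger.

```python
import sys

calls = []


def logging_hook(*args, **kwargs):
    """Stand-in 'debugger' (hypothetical): record the call instead of opening pdb."""
    calls.append((args, kwargs))


# breakpoint() delegates to sys.breakpointhook(), so swapping the hook
# changes what happens at every breakpoint in the program.
original_hook = sys.breakpointhook
sys.breakpointhook = logging_hook


def double(x):
    breakpoint()  # with the default hook this would drop into pdb
    return x * 2


result = double(21)

# Restore the default behaviour once we are done.
sys.breakpointhook = original_hook

print(result, len(calls))  # 42 1
```

With the default hook left in place, running a script with PYTHONBREAKPOINT=0 set in the environment turns every breakpoint() call into a no-op, which is handy if one slips into code that runs unattended.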

Here’s a quick example of a program that we’re having trouble with. The user is asked for a string, and we compare it to see if it matches a value.

"""Test user's favourite Integrated Circuit."""
def test_ic(favourite_ic):
    user_guess = input("Try to guess our favourite IC >>> ")
    if user_guess == favourite_ic:
        return "Yup, that's our favourite!"
    else:
        return "Sorry, that's not our favourite IC"
if __name__ == '__main__':
    favourite_ic = 555
    print(test_ic(favourite_ic))

Unfortunately, no matter what is typed in, we can never seem to match the string.

$ python breakpoint_test.py
Try to guess our favourite IC >>> 555
Sorry, that's not our favourite IC

To figure out what’s going on, let’s chuck in a breakpoint. It’s as simple as calling breakpoint().

"""Test user's favourite Integrated Circuit."""
def test_ic(favourite_ic):
    user_guess = input("Try to guess our favourite IC >>> ")
    breakpoint()
    if user_guess == favourite_ic:
        return "Yup, that's our favourite!"
    else:
        return "Sorry, that's not our favourite IC"
if __name__ == '__main__':
    favourite_ic = 555
    print(test_ic(favourite_ic))

At the pdb prompt, we’ll call locals() to dump the current local scope. The pdb has a shedload of useful commands, but you can also run normal Python in it as well.

$ python breakpoint_test.py
Try to guess our favourite IC >>> 555
> /home/ben/Hackaday/python37/breakpoint_test.py(8)test_ic()
-> if user_guess == favourite_ic:
(Pdb) locals()
{'favourite_ic': 555, 'user_guess': '555'}
(Pdb)

Aha! It looks like favourite_ic is an integer, whilst user_guess is a string. Since comparing a string to an int for equality is perfectly valid in Python, no exception was thrown; the comparison simply evaluates to False every time, which isn’t what we want. favourite_ic should have been declared as a string. This is arguably one of the dangers of Python’s dynamic typing: there’s no way of catching this error until runtime. Unless, of course, you use type annotations…
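
To make the failure mode concrete, here is a minimal sketch using the same two values the debugger showed us. Only the equality test stays silent; an ordering comparison between the two types would have raised an error and given the game away.

```python
favourite_ic = 555   # an int, as in the buggy script
user_guess = "555"   # input() always returns a str

# Equality between a str and an int is allowed, and is simply False.
print(user_guess == favourite_ic)       # False

# Converting one side makes the comparison meaningful.
print(user_guess == str(favourite_ic))  # True

# Ordering comparisons between str and int are not allowed in Python 3.
try:
    user_guess < favourite_ic
    ordering_failed = False
except TypeError:
    ordering_failed = True

print(ordering_failed)  # True
```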

ANNOTATIONS AND TYPING

Since Python 3.5, type annotations have been gaining traction. For those unfamiliar with type hinting, it’s a completely optional way of annotating your code to specify the types of variables.
Type hints are just one application of annotations (albeit the main one). What are annotations? They’re syntactic support for associating metadata with function parameters, return values and variables. They can be considered to be arbitrary expressions which are evaluated when the definition is executed, but otherwise ignored by Python at runtime. An annotation can be any valid Python expression. Here’s an example of an annotated function where we’ve gone bananas with useless information.

# Without annotation
def foo(bar, baz):
    pass

# Annotated (the print() call runs once, when the function is defined)
def foo(bar: 'Describe the bar', baz: print('random')) -> 'return thingy':
    pass

This is all very cool, but a bit meaningless unless annotations are used in standard ways. The syntax for using annotations for typing became standardised in Python 3.5 (PEP 484), and since then type hints have become widely used by the Python community. They’re purely a development aid, which can be checked using an IDE like PyCharm or a third-party tool such as Mypy.
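
A small sketch of what “purely a development aid” means in practice: the hints are stored on the function, but nothing checks them when the function is called. The function describe is a made-up example, not part of any library.

```python
def describe(part: str, pins: int = 8) -> str:
    """Hypothetical helper used only to illustrate annotations."""
    return f"{part} has {pins} pins"


# The annotations are kept in a plain dictionary on the function object.
print(describe.__annotations__)
# {'part': <class 'str'>, 'pins': <class 'int'>, 'return': <class 'str'>}

# Python itself does not enforce them: this call breaks both hints
# and still runs. A checker such as Mypy would flag it instead.
print(describe(555, "eight"))  # 555 has eight pins
```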

If our string comparison program had been written with type annotations, it would have looked like this:

"""Test user's favourite Integrated Circuit."""
def test_ic(favourite_ic: str) -> str:
    user_guess: str = input("Try to guess our favourite IC >>> ")
    breakpoint()
    if user_guess == favourite_ic:
        return "Yup, that's our favourite!"
    else:
        return "Sorry, that's not our favourite IC"
if __name__ == '__main__':
    favourite_ic: int = 555
    print(test_ic(favourite_ic))

A type checker spots the mistake straight away: favourite_ic is annotated as an int, but test_ic() expects a str. PyCharm alerts me to the error in the editor, which would have prevented it going unnoticed until runtime. If your project is using CI (Continuous Integration), you could even configure your pipeline to run Mypy or a similar third-party tool on your code.

So that’s the basics of annotations and type hinting. What’s changing in Python 3.7? As the official Python docs point out, two main issues arose when people began using annotations for type hints: startup performance and forward references.

  • Unsurprisingly, evaluating tons of arbitrary expressions at definition time was quite costly for startup performance, and the typing module itself was extremely slow
  • You couldn’t annotate with types that weren’t declared yet

This lack of forward reference seems reasonable, but becomes quite a nuisance in practice.

class User:
    def __init__(self, name: str, prev_user: User) -> None:
        pass

This fails with a NameError, as prev_user cannot be annotated with the type User, given that User is not defined yet.
To fix both of these issues, evaluation of annotations gets postponed. Annotations simply get stored as strings, and are only evaluated if you really need them to be.

To enable this behaviour, a __future__ import must be used, since this change can’t be made whilst remaining compatible with previous versions.

from __future__ import annotations
class User:
    def __init__(self, name: str, prev_user: User) -> None:
        pass

This now executes without a problem, since the User type is simply not evaluated.
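
If you do need the real types later, they can still be recovered. The sketch below, which assumes it is run as its own script (the __future__ import has to sit at the top of the module), shows the stored strings and then resolves them with typing.get_type_hints().

```python
from __future__ import annotations  # must appear at the top of the module

import typing


class User:
    def __init__(self, name: str, prev_user: User) -> None:
        self.name = name
        self.prev_user = prev_user


# With postponed evaluation, the raw annotations are stored as plain strings.
raw = User.__init__.__annotations__
print(raw)
# {'name': 'str', 'prev_user': 'User', 'return': 'None'}

# typing.get_type_hints() evaluates those strings on demand,
# by which point the User class exists.
hints = typing.get_type_hints(User.__init__)
print(hints["prev_user"] is User)  # True
```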

Part of the reason the typing module was so slow was that there was an initial design goal to implement the typing module without modifying the core CPython interpreter. However, now that the use of type hints is becoming more popular, this restriction has been removed, meaning that there is now core support for typing, which enables several optimisations.

TIMING

The time module has some new kids on the block: existing timer functions are getting a corresponding nanosecond flavour, meaning greater precision is on tap if required. Some benchmarks show that the resolution of time.time_ns() is roughly three times better than that of time.time().
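
Here is a quick sketch of the new functions in use. The nanosecond variants return plain integers, which avoids the rounding that creeps in when a large timestamp is squeezed into a float. The workload being timed is arbitrary.

```python
import time

seconds = time.time()          # float: seconds since the epoch
nanoseconds = time.time_ns()   # int: nanoseconds since the epoch

print(type(seconds).__name__, type(nanoseconds).__name__)  # float int

# perf_counter_ns() is the nanosecond flavour of perf_counter(),
# and is a good fit for timing short pieces of code.
start = time.perf_counter_ns()
total = sum(range(100_000))    # arbitrary workload to time
elapsed_ns = time.perf_counter_ns() - start

print(f"summing took {elapsed_ns} ns")
```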

Talking of timing, Python itself is getting a minor speed boost in 3.7. This is low level stuff so we won’t go into it right now, but the official release notes have the full list of optimisations. All you need to know is that the startup time is 10% faster on Linux, up to 30% faster on macOS, and a large number of method calls are getting zippier by up to 20%.

DATACLASSES

We’re willing to bet that if you’ve ever written object-oriented Python, you’ll have made a class that ended up looking something like this:

class User:
    def __init__(self, name: str, age: int, favourite_ic: str) -> None:
        self.name = name
        self.age = age
        self.favourite_ic = favourite_ic
    def is_adult(self) -> bool:
        """Return True if user is an adult, else False."""
        return self.age >= 18
if __name__ == '__main__':
    john = User('John', 29, '555')
    print(john)
    # prints "<__main__.User object at 0x0076E610>"

A ton of different arguments are received in __init__ when the class gets initialised. These are simply set as attributes of the class instance straight away, ready for later use. This is a pretty common pattern when writing this kind of class, but this is Python, and if tedium can be avoided, it should be.

As of 3.7, we have dataclasses, which will make this type of class easier to declare, and more readable.

Simply decorate a class with @dataclass, and the assignment to self will be taken care of automatically. Variables are declared as shown below, and type annotations are compulsory (though you can still use the Any type if you want to be flexible).

from dataclasses import dataclass
@dataclass
class User:
    name: str
    age: int
    favourite_ic: str
    def is_adult(self) -> bool:
        """Return True if user is an adult, else False."""
        return self.age >= 18
if __name__ == '__main__':
    john = User('John', 29, '555')
    print(john)
    # prints "User(name='John', age=29, favourite_ic='555')"

Not only was the class much easier to set up, but it also produced a lovely string when we created an instance and printed it out. It would also behave properly when being compared to other class instances. This is because, as well as auto-generating the __init__ method, other special methods were generated too, such as __repr__ and __eq__ (and, depending on the options passed to the decorator, __hash__). These vastly reduce the amount of boilerplate needed when properly defining a class like this.
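
The comparison behaviour is easy to check for yourself. In this sketch, FrozenUser is a hypothetical second class added to show how hashing depends on the decorator options: a plain dataclass compares by value but is unhashable, whereas one declared with frozen=True can be used in sets and as a dictionary key.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


@dataclass(frozen=True)
class FrozenUser:  # hypothetical immutable variant, for illustration only
    name: str
    age: int


a = User("John", 29)
b = User("John", 29)

print(a == b)  # True: the generated __eq__ compares field by field
print(a is b)  # False: they are still two separate objects

# A default (mutable) dataclass with __eq__ is deliberately unhashable.
try:
    hash(a)
    user_hashable = True
except TypeError:
    user_hashable = False
print(user_hashable)  # False

# Frozen dataclasses get a __hash__ based on their fields.
unique = {FrozenUser("John", 29), FrozenUser("John", 29)}
print(len(unique))  # 1
```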

Dataclasses use fields to do what they do, and manually constructing a field() gives access to additional options which aren’t the defaults. For example, here the default_factory of the field has been set to a lambda function which prompts the user to enter their name.

from dataclasses import dataclass, field
@dataclass
class User:
    name: str = field(default_factory=lambda: input("enter name"))

(We wouldn’t recommend piping input into an attribute directly like this – it’s just a demo of what fields are capable of.)
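
A more everyday use of default_factory is giving each instance its own mutable default. The sketch below uses a made-up PartsBin class; writing parts: List[str] = [] instead would be rejected with a ValueError when the class is defined, precisely to stop instances sharing one list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PartsBin:  # hypothetical class, for illustration only
    owner: str
    # default_factory is called once per instance, so every bin
    # starts with its own fresh, empty list.
    parts: List[str] = field(default_factory=list)


a = PartsBin("Ada")
b = PartsBin("Ben")
a.parts.append("555")

print(a.parts, b.parts)  # ['555'] []
```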

OTHER

There are other miscellaneous changes aplenty in this release; we’ll just list a few of the most significant here:

  • Dictionaries are now guaranteed to preserve insertion order. This was an implementation detail of CPython 3.6, but is now part of the official language specification. The normal dict should now be able to replace collections.OrderedDict in most cases.
  • New documentation translations into French, Japanese and Korean.
  • Controlling access to module attributes is now much easier, as __getattr__ can now be defined at a module level. This makes it far easier to customise attribute lookup on a module, and implement features such as deprecation warnings.
  • A new developer mode for CPython.
  • .pyc files have the option to be deterministic, enabling reproducible builds — that is, the same byte-for-byte output is always produced for the same input file.
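
To give a flavour of the module-level __getattr__ item above, here is a minimal sketch of a deprecation warning. Normally the function would just live in a module file; to keep the example self-contained it builds a throwaway module in memory instead. The module name legacy_demo and the attributes old_name and new_name are invented for the demo.

```python
import types
import warnings

# Source for a hypothetical module. In a real project this would be
# the contents of legacy_demo.py.
MODULE_SOURCE = '''
import warnings

new_name = "555"

def __getattr__(name):
    # Only called when normal attribute lookup on the module fails.
    if name == "old_name":
        warnings.warn("old_name is deprecated, use new_name",
                      DeprecationWarning, stacklevel=2)
        return new_name
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

legacy = types.ModuleType("legacy_demo")
exec(MODULE_SOURCE, legacy.__dict__)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = legacy.old_name  # falls through to the module's __getattr__

print(value)                        # 555
print(caught[0].category.__name__)  # DeprecationWarning
print(hasattr(legacy, "missing"))   # False
```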

CONCLUSION

There are some really neat syntactic shortcuts and performance improvements to be had, but it might not be enough to encourage everyone to upgrade. Overall, Python 3.7 implements features that will genuinely lead to less hacky solutions, and produce cleaner code. We certainly look forward to using it, and can’t wait for 3.8!

Source: hackaday