How To Make Python Faster

Follow These Quick Wins To Make Python Run Faster

People often complain that Python is too slow. But there are many ways to improve its performance.

This article aims to highlight the key tips in a succinct manner. It is useful for anyone who wants to improve the performance of their Python code.

I reduced the execution time of an application to roughly a tenth of what it was.

I am sharing these steps because I believe they are quick wins and, hopefully, they will help someone in the near future.

It’s a given that the most important task is to write clean and efficient code first. Only once the clean code is in place should you follow these tips. I will explain them in detail now.

Firstly, How Did I Measure Time & Complexity Of My Code?

I used cProfile, Python’s built-in profiler, which measures where a program spends its time. You can pass in an optional output file using the -o parameter to keep a log of the profiling results:

python -m cProfile [-o output_file] my_python_file.py
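You can also drive the profiler from inside a script with the cProfile and pstats modules. A minimal sketch; the function slow_sum is just a stand-in for your own code:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Stand-in for the code you want to profile."""
    total = 0
    for i in range(n):
        total += i
    return total


profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Capture the top entries, sorted by cumulative time, as a string.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
```

Sorting by "tottime" instead shows where time is spent excluding sub-calls, which is often more useful for finding hot spots.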


Use hash-table based data structures

  • If your application is going to perform a large number of search operations on a large collection of items, and the items contain no duplicates, then use a dictionary (or a set).
  • It is a highly performant data collection.
  • Searching for an item costs O(1) on average.
  • Having said that, it is probably not worth the overhead if your collection only contains a few items.

Instead of:

items = ['a', 'b', ..., '100m']  # 1000s of items
found = False
for i in items:
    if i == '100m':
        found = True

Do:

items = {'a': 'a', 'b': 'b', ..., '100m': '100m'}  # each item is key/value
found = False
if '100m' in items:
    found = True

Use lookups instead of looping over collections wherever you can.
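A quick way to see the difference is to time membership tests with the standard timeit module. The collection size and repetition count below are arbitrary:

```python
import timeit

items_list = [str(i) for i in range(100_000)]
items_dict = {item: item for item in items_list}

target = "99999"  # worst case for the list: the item sits at the very end

# 'in' scans the list element by element, but hashes straight into the dict.
list_time = timeit.timeit(lambda: target in items_list, number=100)
dict_time = timeit.timeit(lambda: target in items_dict, number=100)
```

On a typical machine the dict lookup is orders of magnitude faster. A set gives the same O(1) behaviour if you only need membership, not an associated value.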

Use Vectorisation Instead Of Loops

Use Python libraries that are built on C, such as NumPy, SciPy and Pandas, and take advantage of vectorisation. Instead of writing a loop that processes a single array element at a time, a vectorised operation processes the whole array in one call to optimised, compiled code.

import numpy as np
array = np.array([[1., 2., 3.], [4., 5., 6.]])
m_array = array*array
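As a sanity check, the vectorised multiply above gives exactly the same result as an explicit element-by-element loop, while executing in optimised C code:

```python
import numpy as np

array = np.array([[1., 2., 3.], [4., 5., 6.]])

# Vectorised: one call over the whole array.
m_array = array * array

# Equivalent explicit loop, shown only for comparison.
loop_result = np.empty_like(array)
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        loop_result[i, j] = array[i, j] * array[i, j]
```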

Reduce the number of lines in your code

Use inbuilt Python functions such as map(). Note that in Python 3, map() returns a lazy iterator, so wrap it in list() if you need an actual list.

Instead of:

newlist = []

def my_fun(a):
    return a + 't'

for w in some_list:
    newlist.append(my_fun(w))

Do:

def my_fun(a):
    return a + 't'

newlist = list(map(my_fun, some_list))

Updating a string variable creates a new instance each time

Instead of:

my_var = 'Malik'
myname_blog = 'Farhad ' + my_var + ' FinTechExplained'

Do:

my_var = 'Malik'
myname_blog = 'Farhad {0} FinTechExplained'.format(my_var)

The style above avoids creating intermediate string objects, which reduces the memory footprint.
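Related to this, building a string with repeated + concatenation in a loop creates a new string object on every iteration, while str.join allocates the result once. A small sketch (the names are illustrative):

```python
parts = ['Farhad', 'Malik', 'FinTechExplained']

# str.join builds the final string in a single allocation.
joined = ' '.join(parts)

# f-strings (Python 3.6+) are another concise, efficient templating option.
my_var = 'Malik'
myname_blog = f'Farhad {my_var} FinTechExplained'
```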

Use list comprehensions to reduce the lines

Instead of:

for x in big_x:
    for y in x.Y:
        items.append(x.A + y.A)

Do:

items = [x.A+y.A for x in big_x for y in x.Y]
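Since big_x above is left undefined, here is a runnable sketch of the same pattern, using dictionaries in place of objects with A and Y attributes:

```python
# Hypothetical nested data: each x carries a value A and a list Y.
big_x = [
    {"A": 1, "Y": [{"A": 10}, {"A": 20}]},
    {"A": 2, "Y": [{"A": 30}]},
]

# One comprehension replaces the two nested for loops.
items = [x["A"] + y["A"] for x in big_x for y in x["Y"]]
```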

Use Multiprocessing

If your computer has more than one processor core, then look at using multiprocessing in Python.

Multiprocessing enables parallelisation in your code. Multiprocessing is costly, as you are going to instantiate new processes, access shared memory and so on, so only use multiprocessing if there is enough data that you can split it up. For small amounts of data, multiprocessing is not always worth the effort.

Instead of

def some_func(d):
    ...  # computations

data = [1, 2, ..., 10000]  # large data
for d in data:
    some_func(d)

Do

import multiprocessing

def some_func(d):
    ...  # computations

data = [1, 2, ..., 10000]  # large data
with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
    r = pool.map(some_func, data)

Multiprocessing made the biggest difference for me, as it executes multiple execution paths simultaneously.
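A complete, runnable sketch of the pattern above. The __main__ guard matters because, on platforms that spawn rather than fork new processes, the workers re-import the module:

```python
import multiprocessing


def some_func(d):
    # Stand-in for a CPU-bound computation.
    return d * d


def run_pool(data):
    # One worker per CPU core; the context manager closes the pool for us.
    with multiprocessing.Pool(processes=multiprocessing.cpu_count()) as pool:
        return pool.map(some_func, data)


if __name__ == "__main__":
    results = run_pool(range(10))
```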

Use Cython

Cython is a static compiler that can optimise your code for you.

In a Jupyter/IPython notebook, load the Cython extension and use the %%cython cell magic to compile the cell with Cython.

Use Pip to install Cython:

pip install Cython

To use Cython:

%load_ext Cython

%%cython
def do_work():
    ...  # computationally intensive work

Don’t Use Excel If You Don’t Have To

Recently, I was implementing an application. It was taking a long time to load and save data to/from Excel files. Instead, I chose the path of creating multiple CSV files, grouped in a folder.

Note: it depends on your use case. If Excel creation is a bottleneck, maybe you can create multiple CSV files and have a utility in a native language that combines them into an Excel file.

Instead of:

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])
df2 = df.copy()
with pd.ExcelWriter('my.xlsx') as writer:
    df.to_excel(writer, sheet_name='Sheet_name_1')
    df2.to_excel(writer, sheet_name='Sheet_name_2')

Do:

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])
df2 = df.copy()
df.to_csv("my.csv")
df2.to_csv("my2.csv")

Use Numba

It’s a JIT (just-in-time) compiler. Through decorators, Numba compiles annotated Python and NumPy code to LLVM.

Split your function into two parts:

1. A function that performs the calculation: decorate it with @jit

2. A function that performs the IO

from numba import jit

@jit(nopython=True)
def calculation(a):
    ...

def main():
    calc_result = calculation(some_object)
    
    d = np.array(calc_result)
    #save to file
    return d

Use Dask to parallelise Pandas DataFrame operations

Dask is great! It helped me process numerical functions on data frames and NumPy arrays in parallel. I have even scaled it out on a cluster, and it was surprisingly easy!

import pandas as pd
import dask.dataframe as dd

data = pd.DataFrame(...)  # large data set

def my_time_consuming_function(d):
    ...  # long-running function

ddata = dd.from_pandas(data, npartitions=30)

def apply_my_func(df):
    return df.apply(lambda row: my_time_consuming_function(row), axis=1)

def dask_apply():
    return ddata.map_partitions(apply_my_func).compute(scheduler='processes')

Use swifter package

Swifter uses Dask in the background. It automatically figures out the most efficient way to parallelise a function on a dataframe.

It is a plugin for Pandas.

import pandas as pd
import swifter

a_large_data_frame = pd.DataFrame(...)  # large data set

def my_time_consuming_function(data):
    ...

result = a_large_data_frame.swifter.apply(my_time_consuming_function)

Use Pandarallel package

Pandarallel can parallelise pandas operations on multiple processes.

Again, only use if you have a large data set.

import pandas as pd
from pandarallel import pandarallel

pandarallel.initialize()

df = pd.DataFrame(...)  # large data set

def my_time_consuming_function(x):
    ...

df.parallel_apply(my_time_consuming_function, axis=1)

General Tips

  • It’s a given that the most important task is to write clean and efficient code first. We have to ensure that the code is not performing the same calculations within a loop over and over again.
  • It is also important not to open/close IO connections for every record in a collection.
  • Analyse whether objects can be cached.
  • Ensure you are not creating new instances of objects when they are not required.
  • Additionally, ensure your code is not repeating identical computationally intensive tasks, and that it is written in a succinct manner.
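The caching point above can be sketched with functools.lru_cache from the standard library, which memoises a function’s results:

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def expensive(n):
    # Count how often the body actually runs.
    global call_count
    call_count += 1
    return n * n


for _ in range(1000):
    expensive(7)  # computed once, then served from the cache
```

Only use this for pure functions (same input, same output) whose arguments are hashable.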

Once the clean code is implemented, then follow these tips that have been outlined above.

Summary

This article aimed to highlight the key tips in a succinct manner. It is useful for anyone who wants to improve the performance of their Python code.

Hope it helps.

Source: towardsdatascience