Calmcode - decorators: usage

Usage


Decorators are extremely powerful. A great use case for them is adding logging to pandas pipelines. If you're interested in more information on the pandas example below, make sure you watch this course.

The pandas-logger decorator is defined below.

from functools import wraps
import datetime as dt

def log_step(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        tic = dt.datetime.now()
        result = func(*args, **kwargs)
        time_taken = str(dt.datetime.now() - tic)
        print(f"just ran step {func.__name__} shape={result.shape} took {time_taken}")
        return result
    return wrapper
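
The @wraps(func) line is easy to overlook, so here is a minimal sketch of what it changes. The names plain_decorator, wrapped_decorator, step_a and step_b are hypothetical and only serve the illustration.

```python
from functools import wraps

def plain_decorator(func):
    # No @wraps: the returned function keeps the name "wrapper".
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def wrapped_decorator(func):
    # With @wraps: name, docstring and other metadata are copied from func.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain_decorator
def step_a(x):
    """Hypothetical pipeline step."""
    return x

@wrapped_decorator
def step_b(x):
    """Hypothetical pipeline step."""
    return x

print(step_a.__name__)  # wrapper
print(step_b.__name__)  # step_b
```

This matters most when decorators are stacked or when you inspect a decorated function: without @wraps, anything that reads __name__ on the decorated function would see "wrapper" instead of the step's real name.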

You can see log_step being applied here:

import pandas as pd

df = pd.read_csv('https://calmcode.io/datasets/bigmac.csv')

@log_step
def start_pipeline(dataf):
    return dataf.copy()

@log_step
def set_dtypes(dataf):
    return (dataf
            .assign(date=lambda d: pd.to_datetime(d['date']))
            .sort_values(['currency_code', 'date']))

@log_step
def remove_outliers(dataf, min_row_country=32):
    countries = (dataf
                .groupby('currency_code')
                .agg(n=('name', 'count'))
                .loc[lambda d: d['n'] >= min_row_country]
                .index)
    return (dataf
            .loc[lambda d: d['currency_code'].isin(countries)])

df_new = (df
  .pipe(start_pipeline)
  .pipe(set_dtypes)
  .pipe(remove_outliers, min_row_country=20))

Another example of a useful decorator can be found in the retry package.

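import logging
import random

from retry import retry

# Configure logging so that messages logged on each retry are shown.
logging.basicConfig()

# Retry up to 5 times on ValueError, waiting 0.5 seconds between attempts.
@retry(ValueError, tries=5, delay=0.5)
def randomly_fails(p=0.5):
    if random.random() < p:
        raise ValueError("no bueno!")
    return "Done!"

randomly_fails()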
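
To get a feel for how a decorator that takes arguments can work, here is a rough sketch of a retry-style decorator built with only the standard library. It is not the retry package's implementation; simple_retry, fails_twice and the calls counter are hypothetical names made up for this example.

```python
import time
from functools import wraps

def simple_retry(exceptions, tries=3, delay=0.0):
    """Hypothetical stand-in, not the retry package's implementation."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Re-raise once the attempts are used up.
                    if attempt == tries:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator

# Counter used to make the failure deterministic for the demo.
calls = {"n": 0}

@simple_retry(ValueError, tries=5, delay=0)
def fails_twice():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("no bueno!")
    return "Done!"

print(fails_twice())  # Done!
print(calls["n"])     # 3
```

Note the extra layer of nesting: simple_retry(...) is called first and returns the actual decorator, which then receives the function.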

There are plenty of other useful decorators. There's lru_cache from functools, but there's also the multifile decorator that we mention in the video.
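
As a small illustration of lru_cache, the sketch below caches a recursive Fibonacci function. The fib function is a hypothetical example chosen for brevity and is not part of the course material.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None means the cache can grow without bound
def fib(n):
    # Each distinct n is computed once; repeated calls are served from the cache.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(30))           # 832040
print(fib.cache_info())  # hit/miss statistics for the cache
```

Without the decorator the same call would recompute the smaller values over and over; with it, each value from 0 to 30 is computed only once.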