Decorators are extremely powerful. A great use case for them is to add logging to pandas pipelines. If you're interested in more background on the pandas example below, make sure you watch this course.
The pandas-logger decorator is defined below.
from functools import wraps
import datetime as dt

def log_step(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        tic = dt.datetime.now()
        result = func(*args, **kwargs)
        time_taken = (dt.datetime.now() - tic).total_seconds()
        print(f"just ran step {func.__name__} shape={result.shape} took {time_taken}s")
        return result
    return wrapper
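A side note on the implementation: the @wraps decorator from functools copies the wrapped function's metadata (its name, docstring, and so on) onto the wrapper, which is what lets log_step print the original function name instead of "wrapper". A minimal sketch (the passthrough and greet names here are made up for illustration):

```python
from functools import wraps

def passthrough(func):
    # Without @wraps, greet.__name__ below would be "wrapper"
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@passthrough
def greet(name):
    """Say hello."""
    return f"hello {name}"

print(greet.__name__)  # "greet": metadata copied from the original function
print(greet.__doc__)   # "Say hello."
```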
And you can see it being applied here:
import pandas as pd

df = pd.read_csv('https://calmcode.io/datasets/bigmac.csv')

@log_step
def start_pipeline(dataf):
    return dataf.copy()

@log_step
def set_dtypes(dataf):
    return (dataf
            .assign(date=lambda d: pd.to_datetime(d['date']))
            .sort_values(['currency_code', 'date']))

@log_step
def remove_outliers(dataf, min_row_country=32):
    countries = (dataf
                 .groupby('currency_code')
                 .agg(n=('name', 'count'))
                 .loc[lambda d: d['n'] >= min_row_country]
                 .index)
    return (dataf
            .loc[lambda d: d['currency_code'].isin(countries)])

df_new = (df
          .pipe(start_pipeline)
          .pipe(set_dtypes)
          .pipe(remove_outliers, min_row_country=20))
Another example of a useful decorator can be found in the retry
package.
import random
import logging

from retry import retry

logging.basicConfig()

@retry(ValueError, tries=5, delay=0.5)
def randomly_fails(p=0.5):
    if random.random() < p:
        raise ValueError("no bueno!")
    return "Done!"

randomly_fails()
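If you'd rather avoid the extra dependency, the core idea behind retry can be hand-rolled as a decorator factory of your own. This sketch (the simple_retry and flaky names are ours, not part of the retry package) re-calls the function until it succeeds or runs out of tries:

```python
import time
from functools import wraps

def simple_retry(exception, tries=5, delay=0.5):
    # Decorator factory: returns a decorator configured with these arguments.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(tries):
                try:
                    return func(*args, **kwargs)
                except exception:
                    if attempt == tries - 1:
                        raise  # out of tries: re-raise the last error
                    time.sleep(delay)
        return wrapper
    return decorator

@simple_retry(ValueError, tries=3, delay=0.01)
def flaky():
    # Fails on the first two calls, then succeeds.
    flaky.calls += 1
    if flaky.calls < 3:
        raise ValueError("no bueno!")
    return "Done!"

flaky.calls = 0
print(flaky())  # prints "Done!" after two retried failures
```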
There are plenty of other useful decorators. There's functools.lru_cache, but there's also the multifile decorator that we mention in the video.
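For instance, functools.lru_cache memoizes return values, so repeated calls with the same arguments skip the computation entirely. A quick sketch using the classic Fibonacci example:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Naive recursion, but the cache makes it run in linear time
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))          # returns instantly thanks to caching
print(fib.cache_info())  # shows cache hits and misses
```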