Calmcode - ibis: demo

Setting up Ibis


Before we can start using Ibis we need to install it. The command below installs Ibis together with the extras for the backends used in this course: polars, duckdb and pandas. Make sure to include the extra for every backend that you plan to use.

python -m pip install 'ibis-framework[polars,duckdb,pandas]'

If you want to follow along with the course you will also need the same dataset. You can download it from a notebook by running this command:

! wget https://calmcode.io/static/data/birthdays.csv
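If wget is not available on your machine, the download can also be done from Python. The snippet below is a minimal sketch that only uses the standard library. The helper name fetch is our own invention for illustration and is not part of Ibis or of the course material.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch(url, dest=None):
    """Download `url` to `dest` and return the local path.

    `fetch` is a hypothetical helper for this example. When `dest` is
    omitted, the file name is taken from the last part of the URL.
    The download is skipped if the target file already exists.
    """
    target = Path(dest) if dest is not None else Path(url.rsplit("/", 1)[-1])
    if not target.exists():
        urlretrieve(url, target)
    return target
```

Calling fetch("https://calmcode.io/static/data/birthdays.csv") should leave a birthdays.csv file in the current working directory, just like the wget command does.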

Once everything is installed we can start running some code. To enhance the experience in the notebook, we recommend setting the interactive option to True. This makes sure that the output of Ibis queries is rendered in a more readable way.

import ibis

ibis.options.interactive = True

Demo

As a first demo, let's read the birthday dataset with two of our backends.

con_polars = ibis.polars.connect()
tbl_polars = con_polars.read_csv("birthdays.csv")

con_duckdb = ibis.duckdb.connect()
tbl_duckdb = con_duckdb.read_csv("birthdays.csv")

This code sets up two connections first. One connection represents a polars backend while the other one points to a DuckDB backend. Both of these connections can load a csv file and return a table object. The cool thing here is that you can write code that runs on these tables without worrying about the backend under the hood.

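def set_types(dataf):
    # Parse the string column `date` into a proper date type.
    return dataf.mutate(dataf.date.to_date("%Y-%m-%d").name('date'))

def counter(dataf, *args):
    # Group by the given columns, then compute total and average births per group.
    return (
        dataf
         .group_by(args)
         .agg(
             dataf.births.sum().name('sum'),
             dataf.births.mean().name('mean')
         ).order_by(args)
    )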

counter(tbl_duckdb, 'date')
counter(tbl_polars, 'date')

The counter function accepts an Ibis table and performs the same aggregation no matter which backend sits behind it. That's great because it means that you no longer have to concern yourself with the backend and can focus on the data manipulation itself.