Calmcode - diskcache: sqlite

Sqlite


To appreciate some of the details in diskcache it might help to explore the contents of the sqlite file. Before we do that, let's start fresh with a new cache.

from diskcache import Cache

cache = Cache("inspect")

for i, char in enumerate("abcdefg"):
    cache.set(i, char, expire=60, tag="demo")

Now that the cache is populated, let's explore it. You can find the sqlite database in the inspect folder, which is created when we run Cache("inspect").

datasette inspect/cache.db

When you open the database in the Datasette UI, you'll see a table that looks like this.

| rowid | key | raw | store_time | expire_time | access_time | access_count | tag | size | mode | filename | value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | 1 | 1704372802.609287 | 1704372862.609287 | 1704372802.609287 | 0 | demo | 0 | 1 | | a |
| 2 | 1 | 1 | 1704372802.609801 | 1704372862.609801 | 1704372802.609801 | 0 | demo | 0 | 1 | | b |
| 3 | 2 | 1 | 1704372802.6100879 | 1704372862.6100879 | 1704372802.6100879 | 0 | demo | 0 | 1 | | c |
| 4 | 3 | 1 | 1704372802.6103418 | 1704372862.6103418 | 1704372802.6103418 | 0 | demo | 0 | 1 | | d |
| 5 | 4 | 1 | 1704372802.610549 | 1704372862.610549 | 1704372802.610549 | 0 | demo | 0 | 1 | | e |
| 6 | 5 | 1 | 1704372802.61091 | 1704372862.61091 | 1704372802.61091 | 0 | demo | 0 | 1 | | f |
| 7 | 6 | 1 | 1704372802.611135 | 1704372862.611135 | 1704372802.611135 | 0 | demo | 0 | 1 | | g |

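Hopefully you can recognize the correspondence with the aforementioned code: each key and value gets its own row, the tag is "demo", and the expire_time is exactly 60 seconds after the store_time.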

Pickle

Let's now try and cache some other type of object, maybe a Python dictionary.

for i, char in enumerate("abcdefg"):
    cache.set(i, {"a": 1}, expire=60, tag="demo")

The table will look very similar, but the value column will have different content.

| rowid | key | raw | store_time | expire_time | access_time | access_count | tag | size | mode | filename | value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | 1 | 1704372893.624529 | 1704372953.624529 | 1704372893.624529 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 2 | 1 | 1 | 1704372893.626837 | 1704372953.626837 | 1704372893.626837 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 3 | 2 | 1 | 1704372893.628146 | 1704372953.628146 | 1704372893.628146 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 4 | 3 | 1 | 1704372893.628565 | 1704372953.628565 | 1704372893.628565 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 5 | 4 | 1 | 1704372893.629778 | 1704372953.629778 | 1704372893.629778 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 6 | 5 | 1 | 1704372893.630381 | 1704372953.630381 | 1704372893.630381 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
| 7 | 6 | 1 | 1704372893.630854 | 1704372953.630854 | 1704372893.630854 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |

You may wonder what's up with that. A Python dictionary is a type that Sqlite does not support natively. So under the hood diskcache serializes the dictionary into a binary format, using Pickle, and stores the resulting bytes in Sqlite. The mode column records this: it changed from 1 (stored as-is) to 4 (pickled). This is pretty neat because it means that the cache is very flexible in what it can store, but it's good to be aware of what is happening under the hood.
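To make the idea concrete, here is a small sketch that does the same round trip by hand with only the standard library. It is not diskcache's actual implementation. The table name demo and its two columns are made up for this example.

```python
import pickle
import sqlite3

# Hypothetical, simplified stand-in for the cache table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (key INTEGER, value BLOB)")

original = {"a": 1}

# Pickle turns the dictionary into bytes, which Sqlite can store as a blob.
payload = pickle.dumps(original)
conn.execute("INSERT INTO demo VALUES (?, ?)", (0, payload))

# Reading it back gives bytes again, which pickle turns into a dictionary.
(stored,) = conn.execute("SELECT value FROM demo WHERE key = 0").fetchone()
restored = pickle.loads(stored)
conn.close()

print(type(stored).__name__, len(stored))
print(restored == original)
```

The byte count depends on the pickle protocol that your Python version uses, so it may differ from the number that datasette shows.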

Numpy

Pickle is pretty flexible in Python land and this means that you can store things that normally wouldn't really "fit" in a normal database. Just to give an example: you can store Numpy arrays!

import numpy as np

# Store the Numpy arrays
for i in range(10):
    cache.set(i, np.ones(10) * i, expire=60, tag="demo")

# Fetch them
for i in range(10):
    print(cache.get(i))

Custom

This approach is very flexible, but it's good to be aware that there is overhead whenever diskcache needs to pickle. In many cases this is fine, but the library also has a mechanism that lets you control how objects move in and out of sqlite. Check the diskcache docs for more information.