To appreciate some of the details in diskcache, it helps to explore the contents of the SQLite
file it writes. Before we do that, let's start fresh with a new cache.
from diskcache import Cache
cache = Cache("inspect")
for i, char in enumerate("abcdefg"):
    cache.set(i, char, expire=60, tag="demo")
Now that the cache is populated, let's explore it. You can find the SQLite database in the inspect
folder, which is created when we run Cache("inspect").
datasette inspect/cache.db
When you explore the dataset in the UI, you'll see a table that looks like this.
rowid | key | raw | store_time | expire_time | access_time | access_count | tag | size | mode | filename | value |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 1 | 1704372802.609287 | 1704372862.609287 | 1704372802.609287 | 0 | demo | 0 | 1 | | a |
2 | 1 | 1 | 1704372802.609801 | 1704372862.609801 | 1704372802.609801 | 0 | demo | 0 | 1 | | b |
3 | 2 | 1 | 1704372802.6100879 | 1704372862.6100879 | 1704372802.6100879 | 0 | demo | 0 | 1 | | c |
4 | 3 | 1 | 1704372802.6103418 | 1704372862.6103418 | 1704372802.6103418 | 0 | demo | 0 | 1 | | d |
5 | 4 | 1 | 1704372802.610549 | 1704372862.610549 | 1704372802.610549 | 0 | demo | 0 | 1 | | e |
6 | 5 | 1 | 1704372802.61091 | 1704372862.61091 | 1704372802.61091 | 0 | demo | 0 | 1 | | f |
7 | 6 | 1 | 1704372802.611135 | 1704372862.611135 | 1704372802.611135 | 0 | demo | 0 | 1 | | g |
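Hopefully you can see how these rows correspond to the code above: there is one row per key, each row carries the demo tag, and every expire_time is exactly 60 seconds after its store_time.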
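If you'd rather not start Datasette, the same rows can also be read with Python's built-in `sqlite3` module. What follows is only a sketch: it assumes the table is called `Cache`, which is an internal detail of diskcache and might change between versions, and `read_cache_rows` is a helper name made up for this example.

```python
import sqlite3
from datetime import datetime, timezone


def read_cache_rows(db_path, table="Cache"):
    """Return the rows of the cache table as dicts, with a computed TTL.

    Assumption: the table is named "Cache" and has the columns shown in
    the Datasette view (key, value, tag, mode, store_time, expire_time).
    """
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row  # allows access to columns by name
    try:
        # Table names cannot be bound as parameters, so it is interpolated.
        query = (
            "SELECT key, value, tag, mode, store_time, expire_time "
            f'FROM "{table}" ORDER BY rowid'
        )
        rows = con.execute(query).fetchall()
    finally:
        con.close()

    records = []
    for row in rows:
        record = dict(row)
        # The timestamps are seconds since the Unix epoch.
        record["stored_at"] = datetime.fromtimestamp(
            row["store_time"], tz=timezone.utc
        )
        # Time-to-live in seconds, or None when the entry has no expiry.
        if row["expire_time"] is None:
            record["ttl"] = None
        else:
            record["ttl"] = round(row["expire_time"] - row["store_time"], 3)
        records.append(record)
    return records
```

Calling `read_cache_rows("inspect/cache.db")` on the cache from this section should return seven records, each with a `ttl` of 60 seconds.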
Pickle
Let's now try to cache another type of object, such as a Python dictionary.
for i, char in enumerate("abcdefg"):
    cache.set(i, {"a": 1}, expire=60, tag="demo")
The table will look very similar, but the value column now holds different content and the mode column changes from 1 to 4.
rowid | key | raw | store_time | expire_time | access_time | access_count | tag | size | mode | filename | value |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 1 | 1704372893.624529 | 1704372953.624529 | 1704372893.624529 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
2 | 1 | 1 | 1704372893.626837 | 1704372953.626837 | 1704372893.626837 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
3 | 2 | 1 | 1704372893.628146 | 1704372953.628146 | 1704372893.628146 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
4 | 3 | 1 | 1704372893.628565 | 1704372953.628565 | 1704372893.628565 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
5 | 4 | 1 | 1704372893.629778 | 1704372953.629778 | 1704372893.629778 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
6 | 5 | 1 | 1704372893.630381 | 1704372953.630381 | 1704372893.630381 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
7 | 6 | 1 | 1704372893.630854 | 1704372953.630854 | 1704372893.630854 | 0 | demo | 0 | 4 | | <Binary: 21 bytes> |
You may wonder what's up with that. A Python dictionary is a type that SQLite does not support natively, so under the hood diskcache serializes it into a binary format, using pickle, and stores those bytes in SQLite. This is pretty neat because it means the cache is very flexible in what it can store, but it's good to be aware of what is happening under the hood.
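To make that concrete, here is a small sketch of the same idea using only the standard library. It is not diskcache's actual code, and the `demo` table and in-memory database are made up for illustration. The dictionary is pickled to bytes, stored in a BLOB column, and unpickled again on the way out.

```python
import pickle
import sqlite3

value = {"a": 1}

# Serialize the dictionary into bytes; SQLite can store these as a BLOB.
blob = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)

# Hypothetical table, only loosely modelled on the cache table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (key INTEGER PRIMARY KEY, value BLOB)")
con.execute("INSERT INTO demo (key, value) VALUES (?, ?)", (0, blob))

# Reading it back yields bytes, which pickle turns into a dictionary again.
(stored,) = con.execute("SELECT value FROM demo WHERE key = ?", (0,)).fetchone()
restored = pickle.loads(stored)
con.close()

print(type(stored).__name__, len(stored))
print(restored)
```

With pickle protocol 4 or 5 the serialized dictionary should come out at 21 bytes, which lines up with the `<Binary: 21 bytes>` cells in the table.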
NumPy
Pickle is pretty flexible in Python land, which means you can store things that wouldn't really "fit" in a normal database. Just to give an example: you can store NumPy arrays!
import numpy as np

# Store the NumPy arrays
for i in range(10):
    cache.set(i, np.ones(10) * i, expire=60, tag="demo")

# Fetch them
for i in range(10):
    print(cache.get(i))
Custom
This approach is very flexible, but it's good to be aware that there is overhead involved whenever diskcache needs to pickle. In many cases that is fine, but the library also has a mechanism that lets you control how objects move in and out of SQLite. Check the diskcache docs for more information.
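As a rough idea of what such a custom step could look like, the sketch below swaps pickle for compressed JSON using only the standard library. The `encode` and `decode` helpers are hypothetical names for this example and are not diskcache's API. In diskcache itself this kind of logic lives in a `Disk` subclass (the library ships a `JSONDisk` along these lines), so check the docs for the exact hooks.

```python
import json
import zlib


def encode(value, compress_level=1):
    """Turn a JSON-compatible value into compressed bytes."""
    text = json.dumps(value)
    return zlib.compress(text.encode("utf-8"), compress_level)


def decode(blob):
    """Inverse of encode: decompress the bytes and parse the JSON text."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


record = {"a": 1, "tags": ["demo"]}
blob = encode(record)

# The round trip gives back an equal dictionary.
print(decode(blob) == record)
```

The trade-off is that JSON only covers JSON-compatible types, so NumPy arrays would need their own conversion, but the stored bytes stay readable from tools outside of Python.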