Pandas is pretty flexible with formats when you read and write CSV files. For example, you can read a dataframe straight from a URL.
import pandas as pd
df = pd.read_csv("https://calmcode.io/datasets/birthdays.csv")
You can save this file to disk.
But! You can also save it to disk as a zip file.
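Here is a minimal sketch of both saves, assuming `to_csv` is the call used for each. The tiny dataframe and the temporary directory are stand-ins made up for illustration; the file names follow the `stocks` naming used in the rest of this post.

```python
import tempfile
import zipfile
from pathlib import Path

import pandas as pd

# Stand-in data (hypothetical); the post reads the real data from a URL.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

out_dir = Path(tempfile.mkdtemp())  # scratch directory, just for this sketch
csv_path = out_dir / "stocks.csv"
zip_path = out_dir / "stocks.zip"

# Plain CSV on disk.
df.to_csv(csv_path, index=False)

# Same call; with the default compression="infer", the ".zip" suffix
# makes pandas write a zip archive instead of plain text.
df.to_csv(zip_path, index=False)

# Sanity check: the second file really is a zip archive.
print(zipfile.is_zipfile(zip_path))
```

If you want to control the name of the file inside the archive, `to_csv` also accepts a dict such as `compression={"method": "zip", "archive_name": "stocks.csv"}`.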
This zipped file is a fair bit lighter than the standard .csv file.
> ls -lhat stocks*
-rw-r--r-- 1 vincentwarmerdam staff 1.6M 3 Jun 21:06 stocks.zip
-rw-r--r-- 1 vincentwarmerdam staff 11M 3 Jun 21:06 stocks.csv
This .zip file can also be read natively, just like a .csv file.
pd.read_csv("stocks.zip") == pd.read_csv("stocks.csv")
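Note that `==` between two dataframes gives back a dataframe of booleans, one per cell. If you'd rather have a single yes/no answer, `DataFrame.equals` is one option. A small sketch, again with made-up stand-in data and a temporary directory:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in data (hypothetical), written once as .csv and once as .zip.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})
out_dir = Path(tempfile.mkdtemp())
df.to_csv(out_dir / "stocks.csv", index=False)
df.to_csv(out_dir / "stocks.zip", index=False)

# read_csv infers the compression from the file extension.
from_csv = pd.read_csv(out_dir / "stocks.csv")
from_zip = pd.read_csv(out_dir / "stocks.zip")

# Element-wise comparison: a dataframe of booleans.
cellwise = from_csv == from_zip

# Whole-frame comparison: a single boolean.
same = from_csv.equals(from_zip)
print(same)
```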
For very large files with many repeated values this can save a substantial amount of disk space. These .zip files can also be hosted online and downloaded just like the original .csv file.
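To get a feel for how much repeated values matter, here is a rough sketch that writes two made-up dataframes, one very repetitive and one close to random, and compares the zip size against the csv size for each. The data, the `zip_ratio` helper and the file names are all invented for this example; exact numbers will differ on real data.

```python
import random
import tempfile
from pathlib import Path

import pandas as pd

random.seed(0)
n = 20_000

# Hypothetical data: four values repeated over and over...
repetitive = pd.DataFrame({"state": ["CA", "NY", "TX", "WA"] * (n // 4)})
# ...versus random hex strings with almost no repetition.
varied = pd.DataFrame(
    {"token": [f"{random.getrandbits(64):016x}" for _ in range(n)]}
)

out_dir = Path(tempfile.mkdtemp())  # scratch directory for the sketch


def zip_ratio(frame, stem):
    """Write frame as .csv and .zip, return zip size divided by csv size."""
    csv_path = out_dir / f"{stem}.csv"
    zip_path = out_dir / f"{stem}.zip"
    frame.to_csv(csv_path, index=False)
    frame.to_csv(zip_path, index=False)
    return zip_path.stat().st_size / csv_path.stat().st_size


ratio_repetitive = zip_ratio(repetitive, "repetitive")
ratio_varied = zip_ratio(varied, "varied")
print(f"repetitive: {ratio_repetitive:.3f}, varied: {ratio_varied:.3f}")
```

A lower ratio means the zip is smaller relative to the csv, so the repetitive frame should come out far ahead.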