I was exploring the pandas docs while preparing a pandas course for calmcode.io when I stumbled on an interesting fact: there are bounds for timestamps in pandas.
To quote the docs:
Since pandas represents timestamps in nanosecond resolution, the time span that can be represented using a 64-bit integer is limited to approximately 584 years.
To put it bluntly, these are the min/max values.
import pandas as pd
pd.Timestamp.min
# Timestamp('1677-09-21 00:12:43.145224193')
pd.Timestamp.max
# Timestamp('2262-04-11 23:47:16.854775807')
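As a rough sanity check on the 584-year figure and the bounds shown above, the arithmetic can be reproduced with the standard library alone. This sketch assumes timestamps are signed 64-bit counts of nanoseconds since 1970-01-01, with the very lowest value reserved (pandas uses it for NaT). The variable names are made up for illustration, and Python's datetime only keeps microseconds, so the last three digits are dropped.

```python
from datetime import datetime, timedelta

# Nanoseconds in an average year (365.25 days, to account for leap days).
NS_PER_YEAR = 1_000_000_000 * 60 * 60 * 24 * 365.25

# A 64-bit integer has 2**64 distinct values, one per nanosecond.
span_years = 2**64 / NS_PER_YEAR
print(round(span_years, 1))  # 584.5

epoch = datetime(1970, 1, 1)
max_ns = 2**63 - 1     # largest signed 64-bit integer
min_ns = -(2**63) + 1  # assumption: the lowest value is reserved for NaT

# datetime only has microsecond precision, so floor-divide the nanoseconds.
upper = epoch + timedelta(microseconds=max_ns // 1000)
lower = epoch + timedelta(microseconds=min_ns // 1000)
print(upper)  # 2262-04-11 23:47:16.854775
print(lower)  # 1677-09-21 00:12:43.145224
```

The span is centred on the Unix epoch, which is why the bounds sit roughly 292 years on either side of 1970.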
It makes sense when you consider that pandas handles nanoseconds and there’s only so much information you can store in a 64-bit integer. If you have a use case outside of this time span, pandas does have a trick up its sleeve: you can create a date-like Period that can stand in for a datetime.
The Period class.
Here’s how to generate periods.
span = pd.period_range("1215-01-01", "1381-01-01", freq="D")
You can also cast dates manually as an alternative to pd.to_datetime if you like.
s = pd.Series(['1111-01-01', '1212-12-12'])
def convert(item):
    year = int(item[:4])
    month = int(item[5:7])
    day = int(item[8:10])
    return pd.Period(year=year, month=month, day=day, freq="D")
s.apply(convert)
# 0    1111-01-01
# 1    1212-12-12
# dtype: period[D]
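Once the values are periods, the usual date-like conveniences are still there. A small sketch, assuming a reasonably recent pandas; the example dates and variable names are invented for illustration.

```python
import pandas as pd

# Hypothetical medieval dates, built the same way as in convert() above.
periods = pd.Series([
    pd.Period(year=1111, month=1, day=1, freq="D"),
    pd.Period(year=1212, month=12, day=12, freq="D"),
])

# The .dt accessor works on period dtypes too.
years = periods.dt.year.tolist()
months = periods.dt.month.tolist()
print(years, months)

# Adding an integer shifts a Period by that many units of its frequency.
next_day = pd.Period(year=1212, month=12, day=31, freq="D") + 1
print(next_day)  # 1213-01-01
```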
So if you're doing time series analysis on the Middle Ages, now you know what to do!