Pandas is great at dealing with datetimes. Before you can use all the available timeseries features though, you'll need to cast strings into datetime objects that pandas can deal with.
In this series of videos we'll be using pandas version 1.3.4. You can install this version in your notebook via:
%pip install pandas==1.3.4
Checking the Types
Let's first load in a csv file that contains a date column.
import pandas as pd
df = pd.read_csv("https://calmcode.io/datasets/birthdays.csv")
When you check the types, you can confirm that the "date" column is not a datetime-compatible column.
df.dtypes
This command returns:
state     object
year       int64
month      int64
day        int64
date      object
wday      object
births     int64
dtype: object
At the moment the "date" column is of type "object". Let's change that.
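If you'd rather check a single column programmatically than read through the full dtypes listing, something like the sketch below could work. It uses a small hand-made dataframe as a hypothetical stand-in for the birthdays data, so it runs without a download; the values in it are made up for illustration.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical stand-in for the birthdays data: dates stored as plain strings.
df = pd.DataFrame({
    "date": ["1969-01-01", "1969-01-02", "1969-01-03"],
    "births": [10, 12, 9],
})

# The column holds strings, so pandas does not treat it as a datetime column.
date_is_datetime = is_datetime64_any_dtype(df["date"])
print(df["date"].dtype, date_is_datetime)
```

The helper `is_datetime64_any_dtype` lives in `pandas.api.types` and returns a plain boolean, which is handy in an assertion at the top of a notebook.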
Example: convert string with pd.to_datetime
You can convert strings to pandas compatible dates via:
df.assign(date=lambda d: pd.to_datetime(d['date']))
This changes the types!
df.assign(date=lambda d: pd.to_datetime(d['date'])).dtypes
This is the result.
state             object
year               int64
month              int64
day                int64
date      datetime64[ns]
wday              object
births             int64
dtype: object
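To give a feel for what the new type makes possible, here is a minimal sketch on a hypothetical two-row dataframe (the values are made up, not taken from the birthdays file). It also shows that `assign` returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Hypothetical miniature of the birthdays data.
df = pd.DataFrame({"date": ["1969-01-01", "1969-01-02"], "births": [10, 12]})

# assign returns a new dataframe; df itself still holds strings.
converted = df.assign(date=lambda d: pd.to_datetime(d["date"]))

# A datetime column exposes the .dt accessor for date-specific operations.
years = converted["date"].dt.year.tolist()
day_names = converted["date"].dt.day_name().tolist()

print(years)
print(day_names)
```

Because `assign` does not modify `df` in place, you need to store the result in a variable if you want to keep the converted column.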
Speedup by setting the format
Typically, we recommend setting the format upfront when you're casting to a datetime. The reason is that it's typically much faster, although you may not notice unless you're dealing with a big dataframe.
Here's an example of casting to a datetime with a format.
df.assign(date=lambda d: pd.to_datetime(d['date'], format="%Y-%m-%d"))
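If you want to see the difference on your own machine, a rough timing sketch could look like the one below. It assumes a synthetic series of 100,000 date strings rather than the birthdays file, and the size is an arbitrary choice. The measured gap depends on your pandas version and on how the dates are written, so treat the numbers as indicative only.

```python
import time

import pandas as pd

# Hypothetical synthetic data: 100,000 unique date strings in YYYY-MM-DD form.
dates = pd.Series(
    pd.date_range("1969-01-01", periods=100_000, freq="D").strftime("%Y-%m-%d")
)

# Let pandas infer the format.
start = time.perf_counter()
inferred = pd.to_datetime(dates)
t_inferred = time.perf_counter() - start

# Tell pandas the format upfront.
start = time.perf_counter()
explicit = pd.to_datetime(dates, format="%Y-%m-%d")
t_explicit = time.perf_counter() - start

print(f"inferred format: {t_inferred:.4f}s")
print(f"explicit format: {t_explicit:.4f}s")
```

Both calls should produce the same values; only the time it takes to get there may differ. In a notebook, the `%timeit` magic gives more stable measurements than a single `perf_counter` run.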