Calmcode - pandas datetime: introduction

Convert columns into dates in pandas with pd.to_datetime.


Pandas is great at dealing with datetimes. Before you can use all the available timeseries features though, you'll need to cast strings into datetime objects that pandas can deal with.

Installation

In this series of videos we'll be using pandas version 1.3.4. You can install this version in your notebook with:

%pip install pandas==1.3.4

Checking the Types

Let's first load in a csv file that contains a date column.

import pandas as pd

df = pd.read_csv("https://calmcode.io/datasets/birthdays.csv")

When you check the types, you can confirm that the "date" column is not yet a datetime column.

df.dtypes

This command returns:

state     object
year       int64
month      int64
day        int64
date      object
wday      object
births     int64
dtype: object

At the moment the "date" column is of type "object". Let's change that.
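To see why the type matters, here is a minimal sketch. It assumes a tiny hand-made dataframe (the name sample and its values are made up for illustration and are not the birthdays dataset) so that it runs without a download. It checks that a column of date strings is not treated as a datetime column and that the .dt accessor is unavailable on it.

```python
import pandas as pd

# Hypothetical stand-in for the birthdays data: dates stored as plain strings.
# The values are made up for illustration.
sample = pd.DataFrame({
    "date": ["1969-01-01", "1969-01-02", "1969-01-03"],
    "births": [10, 20, 30],
})

# The column holds strings, so pandas does not consider it a datetime column.
is_datetime = pd.api.types.is_datetime64_any_dtype(sample["date"])
print(is_datetime)

# Datetime-only features, such as the .dt accessor, are unavailable on it.
try:
    sample["date"].dt.year
    dt_accessor_works = True
except AttributeError:
    dt_accessor_works = False
print(dt_accessor_works)
```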

Example: convert string with pd.to_datetime

You can convert strings to pandas compatible dates via:

df.assign(date=lambda d: pd.to_datetime(d['date']))

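This changes the type of the "date" column in the dataframe that is returned. Note that assign does not modify df itself, so store the result in a variable if you want to keep it.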

df.assign(date=lambda d: pd.to_datetime(d['date'])).dtypes

This is the result.

state             object
year               int64
month              int64
day                int64
date      datetime64[ns]
wday              object
births             int64
dtype: object
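Once the column is a datetime, the timeseries features become available. The sketch below again assumes a small made-up dataframe (sample is a hypothetical name, not part of the original dataset) and uses the .dt accessor to pull the year out of the converted column.

```python
import pandas as pd

# Hypothetical mini dataset with dates stored as strings.
sample = pd.DataFrame({"date": ["1969-01-01", "1969-01-02", "1969-01-03"]})

# assign returns a new dataframe, so keep the result in its own variable.
converted = sample.assign(date=lambda d: pd.to_datetime(d["date"]))

# The converted column is a datetime column, the original one still is not.
print(pd.api.types.is_datetime64_any_dtype(converted["date"]))
print(pd.api.types.is_datetime64_any_dtype(sample["date"]))

# Datetime features such as .dt.year now work on the converted column.
years = converted["date"].dt.year.tolist()
print(years)
```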

Speedup by setting the format

We recommend setting the format upfront when you're casting to a datetime. It's typically much faster, although you may not notice the difference unless you're dealing with a big dataframe.

Here's an example of casting to a datetime with a format.

df.assign(date=lambda d: pd.to_datetime(d['date'], format="%Y-%m-%d"))
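If you want to measure the difference yourself, a rough sketch could look like the one below. It assumes a synthetic series of 100,000 date strings (the dates variable is made up for this example) and times both calls with time.perf_counter. The size of the gap depends on your pandas version and machine, and newer pandas releases may narrow it, so treat the numbers as indicative only.

```python
import time

import pandas as pd

# Hypothetical benchmark data: 100,000 date strings in year-month-day form.
dates = pd.Series(
    pd.date_range("1969-01-01", periods=100_000, freq="D").strftime("%Y-%m-%d")
)

# Let pandas work out the format on its own.
start = time.perf_counter()
inferred = pd.to_datetime(dates)
inferred_seconds = time.perf_counter() - start

# Tell pandas the format upfront.
start = time.perf_counter()
explicit = pd.to_datetime(dates, format="%Y-%m-%d")
explicit_seconds = time.perf_counter() - start

print(f"without format: {inferred_seconds:.4f}s")
print(f"with format:    {explicit_seconds:.4f}s")
```

Both calls should parse to the same values, so the format only affects how quickly pandas gets there.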