Question

How to convert timedelta to integer in pandas dataframe

I am trying to convert timedelta to integer.

time = (pd.to_datetime(each_date2)-pd.to_datetime(each_date1))
pd.to_numeric(time, downcast='integer')

time has the following value:

Timedelta('7 days 00:00:00')

I am getting the following error on the second line:

TypeError: Invalid object type at position 0

Any help would be greatly appreciated. Thanks in advance.

Solution

Scalar pandas.Timedelta objects

Assuming you want the time difference in days, you can use the pandas.Timedelta.days attribute. For example:

# Create the data
import pandas as pd
each_date1 = '2024-07-17'
each_date2 = '2024-07-24'
time = pd.to_datetime(each_date2) - pd.to_datetime(each_date1)
print(time) # 7 days 00:00:00

Then it's simply:

time.days # 7

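time.days is already an integer, so there is no need to downcast. In fact, downcast appears to have no effect on a scalar that is already numeric. For example, total_seconds() returns a float, and passing it through pd.to_numeric leaves it as one: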

pd.to_numeric(time.total_seconds(), downcast="integer") # 604800.0

The result is still a float. If you want an integer you can do:

int(time.total_seconds()) # 604800
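
If a unit other than days or seconds is needed, one option (a minimal sketch, not part of the original answer) is to floor-divide the Timedelta by another Timedelta that represents one unit. Floor division of two Timedelta objects gives an integer count of whole units. The names hours and minutes below are illustrative only.

```python
import pandas as pd

# Same 7-day difference as in the example above
time = pd.to_datetime('2024-07-24') - pd.to_datetime('2024-07-17')

# Floor division by a one-unit Timedelta counts whole units as an integer
hours = time // pd.Timedelta(hours=1)
minutes = time // pd.Timedelta(minutes=1)

print(hours)    # 168
print(minutes)  # 10080
```

Any partial unit is discarded, so this behaves like int() truncation rather than rounding.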

Series of pandas.Timedelta objects

The same principle applies if each_date1 and each_date2 are pd.Series. However, you need to use the pandas.Series.dt accessor to reach the timedelta properties of the series. For example:

each_date1 = pd.Series(['2023-07-01 12:00:00', '2023-07-05 08:00:00', '2023-07-10 18:00:00'])
each_date2 = pd.Series(['2023-07-08 12:00:00', '2023-07-10 10:00:00', '2023-07-15 20:00:00'])
time = pd.to_datetime(each_date2) - pd.to_datetime(each_date1)
print(time)
# 0   7 days 00:00:00
# 1   5 days 02:00:00
# 2   5 days 02:00:00
# dtype: timedelta64[ns]

To access the days:

time.dt.days
# 0    7
# 1    5
# 2    5
# dtype: int64

To calculate the seconds and cast to integer:

time.dt.total_seconds().astype(int)
# 0    604800
# 1    439200
# 2    439200
# dtype: int64
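
One caveat, sketched here under assumptions: if either date column contains a missing value, the difference is NaT, total_seconds() yields NaN for that row, and .astype(int) cannot represent it. Casting to the pandas nullable "Int64" dtype is one way to keep integer values alongside missing entries. The data below is made up for illustration.

```python
import pandas as pd

# Hypothetical data: the last start date is missing
start = pd.to_datetime(pd.Series(['2023-07-01 12:00:00', '2023-07-05 08:00:00', None]))
end = pd.to_datetime(pd.Series(['2023-07-08 12:00:00', '2023-07-10 10:00:00', '2023-07-15 20:00:00']))
time = end - start  # the last element is NaT

# The nullable integer dtype keeps whole seconds and marks the gap as <NA>
seconds = time.dt.total_seconds().astype("Int64")
print(seconds)
```

Here seconds should hold 604800 and 439200 in the first two rows and a missing value in the third.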
2024-07-25
SamR