Question

Encountering ValueError upon joining two pandas dataframes on a datetime index column

I have two tables which I need to join on a date column. I want to preserve all the dates in both tables, with the empty rows in each table just being filled with NaNs in the final combined array. I think an outer join is what I'm looking for. So I've written this code (with data_1 and data_2 acting as mockups of my actual tables)

import pandas as pd

def main():
    data_1 = [["May-2024", 10, 5], ["June-2024", 3, 5], ["April-2015", 1, 3]]
    df1 = pd.DataFrame(data_1, columns = ["Date", "A", "B"])
    df1["Date"] = pd.to_datetime(df1["Date"], format="%B-%Y")
    print(df1)

    data_2 = [["01-11-2024", 10, 5], ["01-06-2024", 3, 5], ["01-11-2015", 1, 3]]
    df2 = pd.DataFrame(data_2, columns = ["Date", "C", "D"])
    df2["Date"] = pd.to_datetime(df2["Date"], format="%d-%m-%Y")
    print(df2)


    merged = df1.join(df2, how="outer", on=["Date"])
    print(merged)

if __name__ == "__main__":
    main()

But when I try and perform an outer join on two pandas dataframes, I get the error

ValueError: You are trying to merge on object and int64 columns for key 'Date'. If you wish to proceed you should use pd.concat

I checked the datatype of both columns by printing

print(df1["Date"].dtype, df2["Date"].dtype)

and they both seem to be

datetime64[ns] datetime64[ns]

datetimes. So I'm not quite sure why I'm getting a ValueError

Any help is appreciated, thanks.

 3  36  3
1 Jan 1970

Solution

 1

You need to use merge, not join (that will use the index):

# ensure datetime
df1['Date'] = pd.to_datetime(df1['Date'], format='%B-%Y')
df2['Date'] = pd.to_datetime(df2['Date'], dayfirst=True)

# use merge
merged = df1.merge(df2, how='outer', on=['Date'])

For join to work:

merged_df = df1.set_index('Date').join(df2.set_index('Date')).reset_index()

Output:

        Date     A    B     C    D
0 2015-04-01   1.0  3.0   NaN  NaN
1 2015-11-01   NaN  NaN   1.0  3.0
2 2024-05-01  10.0  5.0   NaN  NaN
3 2024-06-01   3.0  5.0   3.0  5.0
4 2024-11-01   NaN  NaN  10.0  5.0
2024-06-29
mozway