
I have two DataFrames. I need to bring the value of df_A["Cycle"] into df_B wherever df_A["Date"] falls within the range from df_B["From_Date"] to df_B["To_Date"].


df_A:                  df_B:

Date         Cycle     From_Date    To_Date 
07.02.2021    C01     07.02.2021  13.02.2021 
08.02.2021    C01     14.02.2021  27.02.2021
14.02.2021    C02     28.02.2021  03.03.2021 
15.06.2021    C02      
28.02.2021    C03      

Desired Output:

df_B:

From_Date    To_Date    Cycle
07.02.2021  13.02.2021   C01
14.02.2021  27.02.2021   C02
28.02.2021  03.03.2021   C03 

So far I have tried np.dot, but it raises a ValueError about mismatched shapes. I found this piece of code online:

s1=Promo_Data["Date From"].values
s2=Promo_Data["Date to"].values
s=Cycle_Mapping["Date"].values[:,None]
Promo_Data["Cyc"]=np.dot((s>=s1)&(s<=s2),Cycle_Mapping["Cycle"])
  • df_A all fall in df_B; could you kindly explain your logic better (08.02.2021 falls between 07.02.2021 and 13.02.2021, yet it is excluded)
    – sammywemmy
    Commented Jul 9, 2021 at 0:14

1 Answer

df1:

        Date Cycle
0 2021-02-07   C01
1 2021-02-08   C01
2 2021-02-14   C02
3 2021-06-15   C02
4 2021-02-28   C03

df2:

   From_Date    To_Date
0 2021-02-07 2021-02-13
1 2021-02-14 2021-02-27
2 2021-02-28 2021-03-03

First, let's make sure that dates are of datetime type:

df1['Date'] = pd.to_datetime(df1['Date'], format='%d.%m.%Y')
df2['From_Date'] = pd.to_datetime(df2['From_Date'], format='%d.%m.%Y')
df2['To_Date'] = pd.to_datetime(df2['To_Date'], format='%d.%m.%Y')

Construct IntervalIndex for df2:

>>> df2.index = pd.IntervalIndex.from_arrays(df2['From_Date'], df2['To_Date'],closed='both')
>>> df2

                          From_Date    To_Date
[2021-02-07, 2021-02-13] 2021-02-07 2021-02-13
[2021-02-14, 2021-02-27] 2021-02-14 2021-02-27
[2021-02-28, 2021-03-03] 2021-02-28 2021-03-03
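
The lookup in the next step expects each date to match at most one interval, so it can be worth checking that the ranges do not overlap before going further. A minimal sketch, rebuilding the same three ranges from the question (the variable names `intervals`, `overlapping` and `first_pos` are only illustrative):

```python
import pandas as pd

# Same three ranges as df2 in this answer, rebuilt here so the snippet runs on its own.
from_dates = pd.to_datetime(['07.02.2021', '14.02.2021', '28.02.2021'], format='%d.%m.%Y')
to_dates = pd.to_datetime(['13.02.2021', '27.02.2021', '03.03.2021'], format='%d.%m.%Y')
intervals = pd.IntervalIndex.from_arrays(from_dates, to_dates, closed='both')

# True if any two intervals share at least one point (with closed='both',
# a shared endpoint counts as an overlap).
overlapping = intervals.is_overlapping

# Position of the single interval that contains a given date.
first_pos = intervals.get_loc(pd.Timestamp('2021-02-08'))
```

If `is_overlapping` is True on your own data, a date can match several rows, and `df2.loc[d]` then returns a DataFrame rather than a single row.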

Define a function that maps each Date in df1 to the matching date range in df2 (returning nothing when no range contains it), and store that range in a new column of df1:

def get_date(d):
    try:
        return df2.loc[d].name
    except KeyError:
        pass

df1['index'] = df1['Date'].apply(get_date)

output:

        Date Cycle                     index
0 2021-02-07   C01  [2021-02-07, 2021-02-13]
1 2021-02-08   C01  [2021-02-07, 2021-02-13]
2 2021-02-14   C02  [2021-02-14, 2021-02-27]
3 2021-06-15   C02                       NaN
4 2021-02-28   C03  [2021-02-28, 2021-03-03]
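
As an alternative to the row-by-row `apply`, the same mapping can be sketched with `IntervalIndex.get_indexer`, which returns the position of the matching interval for each date and -1 where there is none. This is not the method used above, only a vectorized variant that assumes the intervals do not overlap (pandas raises an error otherwise); the names `pos`, `matched` and `result` are illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['07.02.2021', '08.02.2021', '14.02.2021', '15.06.2021', '28.02.2021'],
                    'Cycle': ['C01', 'C01', 'C02', 'C02', 'C03']})
df2 = pd.DataFrame({'From_Date': ['07.02.2021', '14.02.2021', '28.02.2021'],
                    'To_Date': ['13.02.2021', '27.02.2021', '03.03.2021']})

df1['Date'] = pd.to_datetime(df1['Date'], format='%d.%m.%Y')
df2['From_Date'] = pd.to_datetime(df2['From_Date'], format='%d.%m.%Y')
df2['To_Date'] = pd.to_datetime(df2['To_Date'], format='%d.%m.%Y')

intervals = pd.IntervalIndex.from_arrays(df2['From_Date'], df2['To_Date'], closed='both')

# Position in df2 of the interval containing each df1 date; -1 means no match.
pos = intervals.get_indexer(df1['Date'])

# Keep only matched rows and remember which df2 row each one belongs to.
matched = df1[pos >= 0].assign(row=pos[pos >= 0])

# Attach the first Cycle found for each df2 row (df2 still has its default 0..n-1 index).
result = df2.join(matched.groupby('row')['Cycle'].first())
```

Here `result` keeps one row per range in df2, so no separate merge or de-duplication step is needed.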

Merge the two dataframes on "index" and filter the columns:

df2.reset_index().merge(df1, on='index')[['From_Date', 'To_Date', 'Cycle']]

   From_Date    To_Date Cycle
0 2021-02-07 2021-02-13   C01
1 2021-02-07 2021-02-13   C01
2 2021-02-14 2021-02-27   C02
3 2021-02-28 2021-03-03   C03

If you want only one row per range, keeping the first df1 value that falls in it, store the merge result as df3, then group by the range columns and keep the first row of each group:

df3.groupby(['From_Date', 'To_Date'], as_index=False).first()

output:

   From_Date    To_Date Cycle
0 2021-02-07 2021-02-13   C01
1 2021-02-14 2021-02-27   C02
2 2021-02-28 2021-03-03   C03
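
Another way to reach the same table, shown here only as a hedged alternative to the IntervalIndex approach, is `pd.merge_asof`: match each date to the latest From_Date at or before it, then drop the rows whose date lies past To_Date. The names `candidates`, `inside` and `result` are illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['07.02.2021', '08.02.2021', '14.02.2021', '15.06.2021', '28.02.2021'],
                    'Cycle': ['C01', 'C01', 'C02', 'C02', 'C03']})
df2 = pd.DataFrame({'From_Date': ['07.02.2021', '14.02.2021', '28.02.2021'],
                    'To_Date': ['13.02.2021', '27.02.2021', '03.03.2021']})

df1['Date'] = pd.to_datetime(df1['Date'], format='%d.%m.%Y')
df2['From_Date'] = pd.to_datetime(df2['From_Date'], format='%d.%m.%Y')
df2['To_Date'] = pd.to_datetime(df2['To_Date'], format='%d.%m.%Y')

# merge_asof needs both frames sorted by their keys; each Date is paired
# with the most recent From_Date that is <= Date.
candidates = pd.merge_asof(df1.sort_values('Date'), df2.sort_values('From_Date'),
                           left_on='Date', right_on='From_Date')

# A date past the end of its candidate range (15.06.2021 here) is not a real match.
inside = candidates[candidates['Date'] <= candidates['To_Date']]

# One row per range, keeping the Cycle of the earliest matching date.
result = inside.groupby(['From_Date', 'To_Date'], as_index=False)['Cycle'].first()
```

Because df1 is sorted by Date first, "first" here means the earliest date in each range, not the first row in df1's original order.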

Full code:

import pandas as pd

df1 = pd.DataFrame({'Date': ['07.02.2021', '08.02.2021', '14.02.2021', '15.06.2021', '28.02.2021'],
                    'Cycle': ['C01', 'C01', 'C02', 'C02', 'C03']})
df2 = pd.DataFrame({'From_Date': ['07.02.2021', '14.02.2021', '28.02.2021'],
                    'To_Date': ['13.02.2021', '27.02.2021', '03.03.2021']})

df1['Date'] = pd.to_datetime(df1['Date'], format='%d.%m.%Y')
df2['From_Date'] = pd.to_datetime(df2['From_Date'], format='%d.%m.%Y')
df2['To_Date'] = pd.to_datetime(df2['To_Date'], format='%d.%m.%Y')

df2.index = pd.IntervalIndex.from_arrays(df2['From_Date'], df2['To_Date'], closed='both')

def get_date(d):
    try:
        return df2.loc[d].name
    except KeyError:
        pass

df1['index'] = df1['Date'].apply(get_date)

df3 = df2.reset_index().merge(df1, on='index')[['From_Date', 'To_Date', 'Cycle']]

df3.groupby(['From_Date', 'To_Date'], as_index=False).first()
  • Exactly what I needed. But I am getting a "name" error ("DataFrame" object has no attribute "name") at the line return df2.loc[d].name
    – Biplab1985
    Commented Jul 10, 2021 at 12:09
  • I had forgotten to copy one of the lines in the full code. Can you try to run it all at once?
    – mozway
    Commented Jul 10, 2021 at 14:59
