When I merge two DataFrames, the merge somehow seems to add duplicate rows.
I need to keep exactly the number of rows of the left DataFrame.
Data:
import pandas as pd

# main data
df = pd.DataFrame({"campaign_name": ["111", "222", "333"], "leads": [1, 2, 1]})
# reference table
dim_campaign = pd.DataFrame({"campaign_name": ["111", "222", "333"], "Type": ["a", "b", "c"]})
# count the rows per campaign
df.campaign_name.value_counts()
The problem: after merging, I check the number of rows and it has increased. I want to keep all the original rows of "df" and only add the columns from the rows that match.
My code:
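# total leads per campaign; as_index=False keeps campaign_name as a regular column
df = df.groupby("campaign_name", as_index=False)["leads"].sum()
df = pd.merge(df, dim_campaign[["campaign_name", "Type"]], on="campaign_name", how="left")
x = df.loc[df.campaign_name == "222"]
x.leads.sum()
# this gives a higher value than expected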
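
With the sample data above the merge keeps three rows, so the increase probably depends on something in the real data. One possible cause, and it is only an assumption here, is that the real dim_campaign has more than one row for some campaign_name: a left merge repeats each left row once per matching right row. A minimal sketch of that situation, with a made-up duplicate for "222" and one way to guard against it:

```python
import pandas as pd

# Hypothetical data: "222" appears twice in the reference table.
# The sample in the question does not contain this duplicate.
df = pd.DataFrame({"campaign_name": ["111", "222", "333"], "leads": [1, 2, 1]})
dim_campaign = pd.DataFrame({
    "campaign_name": ["111", "222", "222", "333"],
    "Type": ["a", "b", "b", "c"],
})

# A left merge repeats each left row once per matching right row,
# so the duplicated key adds a row and inflates the leads sum.
dup = pd.merge(df, dim_campaign, on="campaign_name", how="left")
rows_with_duplicates = len(dup)

# Keep one row per key in the reference table before merging.
dim_unique = dim_campaign.drop_duplicates(subset="campaign_name")

# validate="many_to_one" makes pandas raise MergeError
# if the right-hand keys are not unique.
merged = pd.merge(
    df, dim_unique, on="campaign_name", how="left", validate="many_to_one"
)
rows_after_fix = len(merged)
leads_222 = merged.loc[merged["campaign_name"] == "222", "leads"].sum()
```

If the duplicated keys carry different Type values, drop_duplicates silently keeps the first one, so it is worth checking dim_campaign.campaign_name.value_counts() before deciding which row to keep.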