I need to create a new column whose value is:

the current fair_price minus the fair_price from 15 minutes ago (or from the closest row)

I need to find the row from 15 minutes earlier and then calculate the difference.

import numpy as np
import pandas as pd
from datetime import timedelta

df = pd.read_csv('./data.csv')


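def calculate_15min(row):
    # Timestamp 15 minutes before this row.
    end_date = pd.to_datetime(row['date']) - timedelta(minutes=15)
    # Rows at or before that timestamp (assumes df is sorted by date, ascending).
    mask = pd.to_datetime(df['date']) <= end_date
    price_before = df.loc[mask, 'fair_price']
    # Closest earlier price, or NaN when no row is old enough.
    return price_before.iloc[-1] if not price_before.empty else np.nan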


def calc_new_val(row):
    return 'show date 15 minutes before, maybe it will be null, nope'


df['15_min_ago'] = df.apply(calculate_15min, axis=1)

myFields = ['pkey_id', 'date', '15_min_ago', 'fair_price']
print(df[myFields].head(5))
df[myFields].head(5).to_csv('output.csv', index=False)


I did this in Node.js, but Python is not my strong suit. Maybe you have a fast solution...

pkey_id,date,fair_price,15_min_ago
465620,2021-05-17 12:28:30,45080.23,fair_price_15_min_before
465625,2021-05-17 12:28:35,45060.17,fair_price_15_min_before
465629,2021-05-17 12:28:40,45052.74,fair_price_15_min_before
465633,2021-05-17 12:28:45,45043.89,fair_price_15_min_before
465636,2021-05-17 12:28:50,45040.93,fair_price_15_min_before
465640,2021-05-17 12:28:56,45049.95,fair_price_15_min_before
465643,2021-05-17 12:29:00,45045.38,fair_price_15_min_before
465646,2021-05-17 12:29:05,45039.87,fair_price_15_min_before
465650,2021-05-17 12:29:10,45045.55,fair_price_15_min_before
465652,2021-05-17 12:29:15,45042.53,fair_price_15_min_before
465653,2021-05-17 12:29:20,45039.34,fair_price_15_min_before
466377,2021-05-17 12:42:50,45142.74,fair_price_15_min_before
466380,2021-05-17 12:42:55,45143.24,fair_price_15_min_before
466393,2021-05-17 12:43:00,45130.98,fair_price_15_min_before
466398,2021-05-17 12:43:05,45128.13,fair_price_15_min_before
466400,2021-05-17 12:43:10,45140.9,fair_price_15_min_before
466401,2021-05-17 12:43:15,45136.38,fair_price_15_min_before
466404,2021-05-17 12:43:20,45118.54,fair_price_15_min_before
466405,2021-05-17 12:43:25,45120.69,fair_price_15_min_before
466407,2021-05-17 12:43:30,45121.37,fair_price_15_min_before
466413,2021-05-17 12:43:36,45133.71,fair_price_15_min_before
466415,2021-05-17 12:43:40,45137.74,fair_price_15_min_before
466419,2021-05-17 12:43:45,45127.96,fair_price_15_min_before
466431,2021-05-17 12:43:50,45100.83,fair_price_15_min_before
466437,2021-05-17 12:43:55,45091.78,fair_price_15_min_before
466438,2021-05-17 12:44:00,45084.75,fair_price_15_min_before
466445,2021-05-17 12:44:06,45094.08,fair_price_15_min_before
466448,2021-05-17 12:44:10,45106.51,fair_price_15_min_before
466456,2021-05-17 12:44:15,45122.97,fair_price_15_min_before
466461,2021-05-17 12:44:20,45106.78,fair_price_15_min_before
466466,2021-05-17 12:44:25,45096.55,fair_price_15_min_before
466469,2021-05-17 12:44:30,45088.06,fair_price_15_min_before
466474,2021-05-17 12:44:35,45086.12,fair_price_15_min_before
466491,2021-05-17 12:44:40,45065.95,fair_price_15_min_before
466495,2021-05-17 12:44:45,45068.21,fair_price_15_min_before
466502,2021-05-17 12:44:55,45066.47,fair_price_15_min_before
466506,2021-05-17 12:45:00,45063.82,fair_price_15_min_before
466512,2021-05-17 12:45:05,45070.48,fair_price_15_min_before
466519,2021-05-17 12:45:10,45050.59,fair_price_15_min_before
466523,2021-05-17 12:45:16,45041.13,fair_price_15_min_before
466526,2021-05-17 12:45:20,45038.36,fair_price_15_min_before
466535,2021-05-17 12:45:25,45029.72,fair_price_15_min_before
466553,2021-05-17 12:45:31,45016.2,fair_price_15_min_before
466557,2021-05-17 12:45:35,45011.2,fair_price_15_min_before
466559,2021-05-17 12:45:40,45007.04,fair_price_15_min_before

This is the CSV

  • In your sample dataset, not a single 'date' matches the dates of '15_min_ago', so how can you calculate the difference between the fair price and the fair price 15 minutes ago? Commented May 26, 2021 at 4:55
  • Hello friend, yes, you are right. I edited it and now we have enough data. I just need to get the first row <= 15 minutes ago and calculate the diff between the fair_price values. Commented May 26, 2021 at 5:47

1 Answer

First, convert your date column to datetime dtype:

df['date'] = pd.to_datetime(df['date'])

Then keep only the rows whose date equals some other row's date minus 15 minutes:

date15min = df['date'] - pd.offsets.DateOffset(minutes=15)
out = df.loc[df['date'].isin(date15min.tolist())]
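
To make the effect of this filter concrete, here is a minimal sketch on a hypothetical three-row sample copied from the question's CSV. It assumes `pd.Timedelta(minutes=15)` in place of the `DateOffset`, which should be interchangeable for a fixed 15-minute shift. Only a row whose timestamp is exactly 15 minutes before another row survives.

```python
import pandas as pd

# Hypothetical three-row sample taken from the question's CSV.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-05-17 12:28:30",
        "2021-05-17 12:43:30",
        "2021-05-17 12:43:36",
    ]),
    "fair_price": [45080.23, 45121.37, 45133.71],
})

# Every timestamp shifted 15 minutes back.
date15min = df["date"] - pd.Timedelta(minutes=15)

# Keep rows whose own timestamp equals one of the shifted timestamps.
out = df.loc[df["date"].isin(date15min)]
print(out)
```

In this sample only the 12:28:30 row is kept, because 12:43:30 is the only timestamp with a row exactly 15 minutes before it; 12:43:36 has no partner at 12:28:36.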

Finally, do your calculations:

matched = (out['date'] + pd.offsets.DateOffset(minutes=15)).tolist()
df['price_before_15min'] = df['fair_price'].where(df['date'].isin(matched))
df['price_before_15min'] = df['price_before_15min'].diff()
df['date_before_15min'] = date15min

Now, if you print df, you will get your desired output.
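
As a cross-check of the exact-match idea, the following is a sketch of a different formulation, not the code above: it re-stamps a copy of the prices 15 minutes later and left-merges it back on `date`. The three-row sample and the column name `diff_15min` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical three-row sample in the question's column layout.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2021-05-17 12:28:30",
        "2021-05-17 12:43:30",
        "2021-05-17 12:43:36",
    ]),
    "fair_price": [45080.23, 45121.37, 45133.71],
})

# Copy of the prices, re-stamped 15 minutes later, so each old price
# lines up with the row that is exactly 15 minutes newer.
earlier = df[["date", "fair_price"]].copy()
earlier["date"] = earlier["date"] + pd.Timedelta(minutes=15)
earlier = earlier.rename(columns={"fair_price": "price_before_15min"})

# A left merge keeps every original row; rows without an exact match get NaN.
result = df.merge(earlier, on="date", how="left")
result["diff_15min"] = result["fair_price"] - result["price_before_15min"]
print(result)
```

Here only the 12:43:30 row gets a value, since it is the only one with a row exactly 15 minutes earlier. This assumes timestamps are unique; duplicated timestamps would produce duplicated rows in the merge.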

Update:

To tolerate a difference in seconds, make a slight change to the method above and compare only the minute component:

out = df.loc[df['date'].dt.minute.isin(date15min.dt.minute.tolist())]
matched_minutes = (out['date'] + pd.offsets.DateOffset(minutes=15)).dt.minute.tolist()
df['price_before_15min'] = df['fair_price'].where(df['date'].dt.minute.isin(matched_minutes))
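
For the "closest row at or before 15 minutes ago" requirement raised in the comments, one alternative worth considering is `pd.merge_asof`. This is a sketch of a swapped-in technique, not the method above. The four-row sample and the helper column `lookup` and output column `diff_15min` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Hypothetical four-row sample copied from the question's CSV.
df = pd.DataFrame({
    "pkey_id": [465620, 465625, 466407, 466413],
    "date": pd.to_datetime([
        "2021-05-17 12:28:30",
        "2021-05-17 12:28:35",
        "2021-05-17 12:43:30",
        "2021-05-17 12:43:36",
    ]),
    "fair_price": [45080.23, 45060.17, 45121.37, 45133.71],
}).sort_values("date").reset_index(drop=True)  # merge_asof needs sorted keys

# Left side: each row plus the timestamp to look up (15 minutes back).
left = df.assign(lookup=df["date"] - pd.Timedelta(minutes=15))

# Right side: the same prices, keyed by their own timestamp.
right = df[["date", "fair_price"]].rename(
    columns={"date": "date_before_15min", "fair_price": "price_before_15min"}
)

# direction="backward" takes the last right-hand row whose key is <= lookup;
# rows with nothing that old are left as NaN/NaT.
result = pd.merge_asof(
    left, right,
    left_on="lookup", right_on="date_before_15min",
    direction="backward",
).drop(columns="lookup")

result["diff_15min"] = result["fair_price"] - result["price_before_15min"]
print(result)
```

With this sample, the two 12:28 rows have no row 15 minutes earlier and stay empty, the 12:43:30 row pairs with 12:28:30, and the 12:43:36 row falls back to 12:28:35. If very old matches should be rejected, `merge_asof` also accepts a `tolerance` argument such as `pd.Timedelta(seconds=10)`.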
4 Comments

Hello friend, thank you for your help, but it is still not right. There is a row with time 12:43:30, so we can take the row 15 minutes before it (12:28:30, the first row) to calculate the diff. But in this example we need to set null for every row before 12:43:30, because there is no row 15 minutes earlier to look up. Do you agree? I can send you a full CSV with 600 lines, but I don't know how to do that on Stack Overflow.
Friend, I updated the answer. Kindly have a look and let me know whether you get your expected output. If you do, I will add explanations :)
Yes bro, amazing. You rock. But actually I didn't explain it right. After the first 15 minutes, the right thing would be to take the next closest value when there is a difference of a second or so. Do you get me? A lot of rows end up empty because the difference is something like 15:01 instead of 15:00. We could take the first row that is at least 15 minutes earlier.
Hey friend, I updated the answer. Kindly have a look and let me know whether it works :)
