14

Problem:

What I'd like to do is step-by-step reduce each value in a Series by a continuously decreasing base figure.

I'm not sure of the terminology for this - I did think I could do something with cumsum and diff but I think I'm leading myself on a wild goose chase there...

Starting code:

import pandas as pd

ALLOWANCE = 100
values = pd.Series([85, 10, 25, 30])

Desired output:

desired = pd.Series([0, 0, 20, 30])

Rationale:

Starting with a base of ALLOWANCE - each value in the Series is reduced by the amount remaining, as is the allowance itself, so the following steps occur:

  • Start with 100, we can completely remove 85 so it becomes 0, we now have 15 left as ALLOWANCE
  • The next value is 10 and we still have 15 available, so this becomes 0 again and we have 5 left.
  • The next value is 25 - we only have 5 left, so this becomes 20 and now we have no further allowance.
  • The next value is 30, and since there's no allowance, the value remains as 30.
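For reference, the steps above can be sketched as a plain Python loop. This is only an illustration of the rule, not a pandas solution; the function name `apply_allowance` is made up here, and it assumes all values are non-negative.

```python
def apply_allowance(values, allowance):
    """Reduce each value in turn by whatever allowance is still left."""
    remaining = allowance
    result = []
    for value in values:
        # the part of this value that the remaining allowance can absorb
        used = min(value, remaining)
        result.append(value - used)
        remaining -= used
    return result


print(apply_allowance([85, 10, 25, 30], 100))  # [0, 0, 20, 30]
```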
1
  • I would rename the values variable into expenses and the desired variable into debts, which in combination with allowance makes the reader understand what you are trying to accomplish without even looking at the text, imo. Commented Feb 23, 2015 at 22:44

4 Answers

9

Following your initial idea of cumsum and diff, you could write:

>>> (values.cumsum() - ALLOWANCE).clip_lower(0).diff().fillna(0)
0     0
1     0
2    20
3    30
dtype: float64

This is the cumulative sum of values minus the allowance. Negative values are clipped to zeros (since we don't care about numbers until we have overdrawn our allowance). From there, you can calculate the difference.
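To make the intermediate steps visible, here is a small sketch that names each stage. It uses `clip(lower=0)`, which as far as I know is the equivalent spelling in newer pandas releases where `clip_lower` is no longer available; the variable names are only illustrative.

```python
import pandas as pd

ALLOWANCE = 100
values = pd.Series([85, 10, 25, 30])

running = values.cumsum()            # 85, 95, 120, 150
overdrawn = running - ALLOWANCE      # -15, -5, 20, 50
clipped = overdrawn.clip(lower=0)    # 0, 0, 20, 50
result = clipped.diff().fillna(0)    # 0, 0, 20, 30 (as floats, because diff introduces NaN)

print(result.tolist())
```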

However, if the first value might be greater than the allowance, the following two-line variation is preferred:

s = (values.cumsum() - ALLOWANCE).clip_lower(0)
desired = s.diff().fillna(s)

This fills the first NaN value with the "first value - allowance" value. So in the case where ALLOWANCE is lowered to 75, it returns desired as Series([10, 10, 25, 30]).
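A minimal sketch of the same two-line idea wrapped in a function, so different allowances can be tried side by side. The name `reduce_by_allowance` is hypothetical, and `clip(lower=0)` is assumed as the newer spelling of `clip_lower(0)`.

```python
import pandas as pd


def reduce_by_allowance(values, allowance):
    # amount by which the running total exceeds the allowance, never negative
    excess = (values.cumsum() - allowance).clip(lower=0)
    # diff() leaves NaN in the first slot; fill it from excess itself
    return excess.diff().fillna(excess)


values = pd.Series([85, 10, 25, 30])
print(reduce_by_allowance(values, 100).tolist())
print(reduce_by_allowance(values, 75).tolist())
```

With an allowance of 70 the first element comes out as 15, which is the case raised in the comments.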


4 Comments

This doesn't appear to handle where the first element of the Series > ALLOWANCE :(
@JonClements you just need to append .fillna(0)
@EdChum can't do that - think I need to use similar to Carsten's answer, if the first value in the series remains 85, and the ALLOWANCE is 70, the result is 0 - which is incorrect - it should be 15
I've gone for a hybrid between yours and Carsten's answer - I like the clip_lower() in this one, even though Carsten was the first to point out that .fillna(0) would yield incorrect results. (although you have corrected that - thanks)
8

Your idea with cumsum and diff works. It doesn't look too complicated; not sure if there's an even shorter solution. First, we compute the cumulative sum, operate on that, and then go back (diff is kinda sorta the inverse function of cumsum).

c = values.cumsum() - ALLOWANCE
# now we've got [-15, -5, 20, 50]
c[c < 0] = 0 # negative values don't make sense here

# (c - c.shift(1)) # <-- what I had first: diff by accident

# it is important that we don't fill with 0, in case the first
# value is greater than ALLOWANCE
c.diff().fillna(max(0, values[0] - ALLOWANCE))
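As a quick illustration of why diff only "kinda sorta" undoes cumsum: diff has no predecessor for the first element, so that slot comes back as NaN and has to be filled from the cumulative series. A small sketch, with `restored` as an illustrative name:

```python
import pandas as pd

values = pd.Series([85, 10, 25, 30])
c = values.cumsum()

# diff() recovers every element except the first, which becomes NaN;
# filling from c itself restores it
restored = c.diff().fillna(c)

print(restored.tolist())
```

That missing first element is the reason the fill value matters in the snippet above.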


5

This is probably not so performant but at the moment this is a Pandas way of doing this using rolling_apply:

In [53]:

ALLOWANCE = 100
def reduce(x):
    global ALLOWANCE
    # short circuit if we've already reached 0
    if ALLOWANCE == 0:
        return x
    val = max(0, x - ALLOWANCE)
    ALLOWANCE = max(0, ALLOWANCE - x)
    return val

pd.rolling_apply(values, window=1, func=reduce)
Out[53]:
0     0
1     0
2    20
3    30
dtype: float64

Or more simply:

In [58]:

values.apply(reduce)
Out[58]:
0     0
1     0
2    20
3    30
dtype: int64
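One possible way to avoid the global is to keep the remaining allowance in a closure. This is only a sketch: `make_reducer` and `reduce_value` are names made up here, and it relies on `Series.apply` calling the function once per element, in order, just as the version above does.

```python
import pandas as pd


def make_reducer(allowance):
    remaining = allowance  # state kept between calls

    def reduce_value(x):
        nonlocal remaining
        used = min(x, remaining)  # part of x the remaining allowance absorbs
        remaining -= used
        return x - used

    return reduce_value


values = pd.Series([85, 10, 25, 30])
result = values.apply(make_reducer(100))
print(result.tolist())  # [0, 0, 20, 30]
```

Each call to `make_reducer` starts with a fresh allowance, so the function can be reused without resetting anything by hand.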

2 Comments

There is likely a better way to rewrite my function, I'm not a python expert, I was thinking that this could be rewritten using a generator but it didn't quite work for some reason. Ideally I'd short circuit this if the allowance is already 0 and return the passed in row value
It's certainly pointed me in what looks like the right direction and given me some ideas... thanks very much - reading up on rolling_apply now
1

It should work with a while loop:

ii = 0
while (ALLOWANCE > 0 and ii < len(values)):
    if (ALLOWANCE > values[ii]):
        ALLOWANCE -= values[ii]
        values[ii] = 0
    else:
        values[ii] -= ALLOWANCE
        ALLOWANCE = 0
    ii += 1 

1 Comment

Thanks. While this will work, I'm also planning on doing other operations in pandas - so I'm really after a pandas based solution if possible.
