I am trying to optimize workers' schedules, based on the following dataframe.

    Time Windows Shift 1 Shift 2 Shift 3 Shift 4  Workers Required
0    6:00 - 9:00       1       0       0       1              55.0
1   9:00 - 12:00       1       0       0       0              46.0
2  12:00 - 15:00       1       1       0       0              59.0
3  15:00 - 18:00       0       1       0       0              23.0
4  18:00 - 21:00       0       1       1       0              60.0
5  21:00 - 24:00       0       0       1       0              38.0
6   24:00 - 3:00       0       0       1       1              20.0
7    3:00 - 6:00       0       0       0       1              30.0
8      Wage_Rate     135     140     190     188               0.0

First (create dataframe):

import pandas as pd
df = pd.read_clipboard(sep='\\s+')
df = pd.DataFrame(df)

Here is the code that I am testing.

import pandas as pd
import pulp
from pulp import LpMaximize, LpMinimize, LpProblem, LpStatus, lpSum, LpVariable
import numpy as np


df = df.fillna(0).applymap(lambda x: 1 if x == "X" else x)

df.set_index('Time Windows')
a = df.drop(columns=["Workers Required"]).values
a


df.drop(df.tail(1).index,inplace=True)
print(df.shape)


df = df.fillna(0).applymap(lambda x: 1 if x == "X" else x)
print(df.shape)


a = df.to_numpy()
a


# number of shifts
n = a.shape[0]


# number of time windows
T = a.shape[0]


# number of workers required per time window
d = df["Workers Required"].values


# wage rate per shift
#Get last row of dataframe
last_row = df.iloc[-1:,1:]
#Get last row of dataframe as numpy array
w = last_row.to_numpy()
w


# Decision variables
y = LpVariable.dicts("num_workers", list(range(n)), lowBound=0, cat="Integer")
y


# Create problem
prob = LpProblem("scheduling_workers", LpMinimize)



prob += lpSum([w[j] * y[j] for j in range(n)])


for t in range(T):
    prob += lpSum([a[t, j] * y[j] for j in range(n)]) >= d[t]


prob.solve()
print("Status:", LpStatus[prob.status])


for shift in range(n):
    print(f"The number of workers needed for shift {shift} is {int(y[shift].value())} workers")

When I get to this line:

prob += lpSum([w[j] * y[j] for j in range(n)])

I get this error.

Traceback (most recent call last):

  Cell In[197], line 1
    prob += lpSum([w[j] * y[j] for j in range(n)])

  Cell In[197], line 1 in <listcomp>
    prob += lpSum([w[j] * y[j] for j in range(n)])

IndexError: index 1 is out of bounds for axis 0 with size 1

The example I am trying to follow is from the link below.

https://towardsdatascience.com/how-to-solve-a-staff-scheduling-problem-with-python-63ae50435ba4

3 Comments
  • Please start by following standard debugging procedures, and attempt to figure out where the problem seems to be; then ask a specific question about where you are actually confused. We do not "find the bug" here. Commented Apr 3, 2023 at 23:57
  • Check value of n Commented Apr 4, 2023 at 3:41
  • You should check the shape of w and y and make sure that you are multiplying along the correct axis. Commented Apr 4, 2023 at 3:51
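
The shape check suggested in the comments can be reproduced in isolation. The following is only a minimal sketch: the two-shift frame is made up for illustration (it is not the question's data), and it assumes the wage row is still the last row of the frame. It shows that slicing a single row with `iloc[-1:, ...]` keeps the result two-dimensional, so indexing it with `w[1]` fails in the same way as in the traceback above.

```python
import pandas as pd

# Hypothetical two-shift frame with the wage row last, mirroring the question's layout.
df = pd.DataFrame(
    {
        "Shift 1": [1, 0, 135],
        "Shift 2": [0, 1, 140],
        "Workers Required": [5.0, 7.0, 0.0],
    },
    index=["6:00 - 9:00", "9:00 - 12:00", "Wage_Rate"],
)

# A slice (-1:) keeps the row dimension, so the array has one row and two columns.
w_2d = df.iloc[-1:, :2].to_numpy()
print(w_2d.shape)  # (1, 2)

# Selecting the row by label yields a Series, which converts to a 1-D array.
w_1d = df.loc["Wage_Rate", ["Shift 1", "Shift 2"]].to_numpy()
print(w_1d.shape)  # (2,)

# Indexing the 2-D array along axis 0 with anything but 0 raises IndexError.
try:
    w_2d[1]
except IndexError as exc:
    print("IndexError:", exc)
```

With the 1-D array, `w_1d[j]` is the wage for shift `j`, which is what the objective in the question expects.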

1 Answer

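Your problems mostly come from misuse of Pandas. `df.iloc[-1:, 1:]` returns a one-row DataFrame, so `w` is a 2-D array with a single row and `w[1]` is out of bounds. By that point the `Wage_Rate` row has also already been dropped, and `n = a.shape[0]` counts rows (time windows), not shifts. Use sane slicing, and it works fine: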

from io import StringIO

import pandas as pd
import pulp

with StringIO(
'''Time Windows,Shift 1,Shift 2,Shift 3,Shift 4,Workers Required
    6:00 - 9:00,      1,      0,      0,      1,            55.0
   9:00 - 12:00,      1,      0,      0,      0,            46.0
  12:00 - 15:00,      1,      1,      0,      0,            59.0
  15:00 - 18:00,      0,      1,      0,      0,            23.0
  18:00 - 21:00,      0,      1,      1,      0,            60.0
  21:00 - 24:00,      0,      0,      1,      0,            38.0
   24:00 - 3:00,      0,      0,      1,      1,            20.0
    3:00 - 6:00,      0,      0,      0,      1,            30.0
      Wage_Rate,    135,    140,    190,    188,             0.0''') as f:
    df = pd.read_csv(f, skipinitialspace=True, index_col='Time Windows')

is_shift = df.columns.str.startswith('Shift')
is_wage = df.index == 'Wage_Rate'
shifts = df.loc[~is_wage, is_shift]
wage_rate = df.loc[is_wage, is_shift].squeeze()
workers_req = df.loc[~is_wage, 'Workers Required']

workers_per_shift = pulp.LpVariable.dicts(name='workers_per', indices=shifts.columns, lowBound=0, cat=pulp.LpInteger)
prob = pulp.LpProblem(name='scheduling_workers', sense=pulp.LpMinimize)
prob.objective = pulp.lpDot(wage_rate, workers_per_shift.values())

for (time, shift), worker_req in zip(shifts.iterrows(), workers_req):
    prob.addConstraint(name=f'workers_min_{time}', constraint=pulp.lpDot(shift, workers_per_shift.values()) >= worker_req)

print(prob)
prob.solve()

for k, v in workers_per_shift.items():
    print(f'{k} has {v.value():.0f} workers')

Output:

scheduling_workers:
MINIMIZE
135*workers_per_Shift_1 + 140*workers_per_Shift_2 + 190*workers_per_Shift_3 + 188*workers_per_Shift_4 + 0
SUBJECT TO
workers_min_6:00___9:00: workers_per_Shift_1 + workers_per_Shift_4 >= 55

workers_min_9:00___12:00: workers_per_Shift_1 >= 46

workers_min_12:00___15:00: workers_per_Shift_1 + workers_per_Shift_2 >= 59

workers_min_15:00___18:00: workers_per_Shift_2 >= 23

workers_min_18:00___21:00: workers_per_Shift_2 + workers_per_Shift_3 >= 60

workers_min_21:00___24:00: workers_per_Shift_3 >= 38

workers_min_24:00___3:00: workers_per_Shift_3 + workers_per_Shift_4 >= 20

workers_min_3:00___6:00: workers_per_Shift_4 >= 30

VARIABLES
0 <= workers_per_Shift_1 Integer
0 <= workers_per_Shift_2 Integer
0 <= workers_per_Shift_3 Integer
0 <= workers_per_Shift_4 Integer

Welcome to the CBC MILP Solver 
Version: 2.10.3 
Build Date: Dec 15 2019 

command line - C:\Users\gtoom\src\stackexchange\.venv\lib\site-packages\pulp\solverdir\cbc\win\64\cbc.exe C:\Users\gtoom\AppData\Local\Temp\38759563a6c3439b8221b857b9f617af-pulp.mps timeMode elapsed branch printingOptions all solution C:\Users\gtoom\AppData\Local\Temp\38759563a6c3439b8221b857b9f617af-pulp.sol (default strategy 1)
At line 2 NAME          MODEL
At line 3 ROWS
At line 13 COLUMNS
At line 38 RHS
At line 47 BOUNDS
At line 52 ENDATA
Problem MODEL has 8 rows, 4 columns and 12 elements
Coin0008I MODEL read with 0 errors
Option for timeMode changed from cpu to elapsed
Continuous objective value is 22290 - 0.00 seconds
Cgl0004I processed model has 0 rows, 0 columns (0 integer (0 of which binary)) and 0 elements
Cbc3007W No integer variables - nothing to do
Cuts at root node changed objective from 22290 to -1.79769e+308
Probing was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Gomory was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Knapsack was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
Clique was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
MixedIntegerRounding2 was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
FlowCover was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
TwoMirCuts was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)
ZeroHalf was tried 0 times and created 0 cuts of which 0 were active after adding rounds of cuts (0.000 seconds)

Result - Optimal solution found

Objective value:                22290.00000000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.00
Time (Wallclock seconds):       0.00

Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.01   (Wallclock seconds):       0.01

Shift 1 has 46 workers
Shift 2 has 23 workers
Shift 3 has 38 workers
Shift 4 has 30 workers
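
As a sanity check on the result above, the printed solution can be verified without a solver. This is a sketch in plain Python with the coverage matrix, requirements and wages copied from the question's table; the variable names are illustrative only. It confirms that every time window is covered, recomputes the cost, and derives the lower bound that each shift inherits from the windows it alone covers.

```python
# Coverage matrix from the question: one row per time window, one column per shift.
coverage = [
    [1, 0, 0, 1],  # 6:00 - 9:00
    [1, 0, 0, 0],  # 9:00 - 12:00
    [1, 1, 0, 0],  # 12:00 - 15:00
    [0, 1, 0, 0],  # 15:00 - 18:00
    [0, 1, 1, 0],  # 18:00 - 21:00
    [0, 0, 1, 0],  # 21:00 - 24:00
    [0, 0, 1, 1],  # 24:00 - 3:00
    [0, 0, 0, 1],  # 3:00 - 6:00
]
required = [55, 46, 59, 23, 60, 38, 20, 30]
wage = [135, 140, 190, 188]
solution = [46, 23, 38, 30]  # workers per shift, as printed by the solver

# Workers present in each window under the reported solution.
staffed = [sum(a * y for a, y in zip(row, solution)) for row in coverage]
feasible = all(s >= r for s, r in zip(staffed, required))
cost = sum(w * y for w, y in zip(wage, solution))

# Windows covered by exactly one shift force a lower bound on that shift.
lower = [0] * len(wage)
for row, r in zip(coverage, required):
    if sum(row) == 1:
        j = row.index(1)
        lower[j] = max(lower[j], r)

print(feasible, cost, lower)  # True 22290 [46, 23, 38, 30]
```

Because the forced lower bounds already satisfy every other window and all wages are positive, no cheaper assignment exists. This is consistent with the solver log reporting a processed model with 0 rows and 0 columns.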