I am solving an integer programming model by importing Cplex as a library in Python. Let's say the optimization problem has a constraint of the form Ax = b:
x0 + x1 + x1 + x3 = 1
The indices of the x variables in this constraint are 0, 1, 1, and 3, and they are stored in a list: indices = [0, 1, 1, 3]. The coefficients of these variables are stored in another list: coeff = [1, 1, 1, 1].
Cplex does not accept duplicate indices, so the constraint should look like this:

x0 + 2x1 + x3 = 1

and the two lists should be updated accordingly:

indices = [0, 1, 3]
coeff = [1, 2, 1]
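For reference, a deduplicated constraint like this would be passed to cplex roughly as follows. This is only a minimal sketch, assuming the standard cplex Python API (cplex.SparsePair and linear_constraints.add); the variable names and types here are purely illustrative:

import cplex

cpx = cplex.Cplex()
# Four binary variables x0..x3 (illustrative only; the real model defines its own variables).
cpx.variables.add(names=["x0", "x1", "x2", "x3"], types=["B", "B", "B", "B"])
# x0 + 2*x1 + x3 = 1, given as index/value lists with no duplicate indices.
cpx.linear_constraints.add(
    lin_expr=[cplex.SparsePair(ind=[0, 1, 3], val=[1.0, 2.0, 1.0])],
    senses=["E"],
    rhs=[1.0],
)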
I have this function that takes indices and coefficients as two arguments and gives me the manipulated lists:
import numpy as np

def manipulate(indices, coeff):
    u = np.unique(indices)
    # For each unique index, sum the coefficients at every position where it occurs.
    sums = {ui: np.sum([coeff[c[0]] for c in np.argwhere(indices == ui)]) for ui in u}
    return list(sums.keys()), list(sums.values())
So, manipulate([0, 1, 1, 3], [1, 1, 1, 1]) returns ([0, 1, 3], [1, 2, 1]).
My problem is that when I have many variables, the lists can have a length of a million, and I have millions of such constraints. When I build and solve my optimization problem using cplex, the program becomes very slow. I tracked the time spent in each function and found that the most time-consuming part of my code is this calculation, and I guess it is because of how numpy is used here. I need to make this function more efficient to hopefully decrease the execution time. Could you please share any thoughts and suggestions on how to change the function manipulate?
Thanks a lot.
Regarding manipulate: with that comprehension and argwhere it clearly is slow. It is fine if it is used once when setting up the problem, but it should not be used repeatedly in code called by cplex.
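For example, the per-unique-index argwhere scan can be replaced by a single grouped sum, either with a plain dict or with np.unique(return_inverse=True) plus np.bincount. The following is only a sketch (the function names are placeholders), but both variants do the same aggregation in one pass over the data:

import numpy as np

def manipulate_dict(indices, coeff):
    # One pass over the constraint: accumulate the coefficient of each index.
    sums = {}
    for i, c in zip(indices, coeff):
        sums[i] = sums.get(i, 0) + c
    return list(sums.keys()), list(sums.values())

def manipulate_bincount(indices, coeff):
    # Vectorized variant: map each entry to its position among the unique
    # indices, then sum the coefficients per group with bincount.
    uniq, inverse = np.unique(indices, return_inverse=True)
    summed = np.bincount(inverse, weights=coeff)
    return uniq.tolist(), summed.tolist()

Both avoid calling np.argwhere once per unique index, which is what makes the original roughly O(n·u) for n entries and u unique indices. Note that the dict version keeps the indices in order of first occurrence rather than sorted, and the bincount version returns the coefficients as floats; as far as I know, neither should matter when the lists are handed to cplex.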