Skip to main content
correct time complexity of `remove` operations as pointed out in comments
Source Link
gazoh
  • 3.4k
  • 10
  • 26

Coding style

I suppose your Solution class and function signature are not your call but were imposed by your programming exercise environment (I'm not familiar with GeeksForGeeks, but it looks very similar to LeetCode).

As such, the right call is to stick to these conventions, which you did and which is fine.

However, these conventions differ from the conventions recommended by PEP8. In case you want to practice your coding style as well, there are a few changes that you should make:

  • Rename your method name using snake case: min_cost
  • Document your class and methods using docstrings, not comments: see PEP257
  • Remove unused variables, in this case the n argument which serve no purpose.

Also, some changes should be made even in the context of whatever conversion you are following:

  • there is no need to initialize cost outside of its scope
  • rename arr2 with a more descriptive name, perhaps costs
  • the loop counter x is unused. Such throwaway variables are conventionally named _
  • add some space around operators: cost = arr[0] + arr[1], return sum(arr2)

Performance

Small improvements

Get rid of arr2. You don't need to keep track of all individual costs, only their total.

As such, you can simply initialize total_cost = 0 and add the latest cost to the total instead of appending it to the list. Finally, simply return that value.

This matters especially on larger scales, are as values are appended to the list, memory is regularly allocated to fit the data, which is rather slow. Updating the value of a single integer requires no further allocation.

Algorithm

The problem states:

Expected Time Complexity : O(nlogn)

In your case, you loop over the size of the array (o(n)) and sort a list in each loop (o(m logm)). Since your list size decreases with each pass, the time complexity isn't quite o(n²logn), and in fact quick testing shows it is approximately o(n²). There lies your performance issue.

Appending and removing from the end of lists is o(1) in Python, so thesethe append operations should scale fairly well (o(n), due to the loop). However, the remove operations on the start of the list are o(n), so you should find a way to do without them.

In fact, on a fairly large input, profiling shows that >90% of the time is spent sorting lists.

The obvious target here is to get rid of the sort. One possibility is to insert the cost into the right position in the array, saving quite some time. On my machine, execution for a list of size 100,000 dropped from 19 seconds to 5, an appreciable improvement. However, the algorithm is still o(n²) overall and won't scale nicely with larger inputs.

Unfortunately, I can't think of a solution right now to hit the o(n logn) target. Try to think outside of the box, and good luck.

Coding style

I suppose your Solution class and function signature are not your call but were imposed by your programming exercise environment (I'm not familiar with GeeksForGeeks, but it looks very similar to LeetCode).

As such, the right call is to stick to these conventions, which you did and which is fine.

However, these conventions differ from the conventions recommended by PEP8. In case you want to practice your coding style as well, there are a few changes that you should make:

  • Rename your method name using snake case: min_cost
  • Document your class and methods using docstrings, not comments: see PEP257
  • Remove unused variables, in this case the n argument which serve no purpose.

Also, some changes should be made even in the context of whatever conversion you are following:

  • there is no need to initialize cost outside of its scope
  • rename arr2 with a more descriptive name, perhaps costs
  • the loop counter x is unused. Such throwaway variables are conventionally named _
  • add some space around operators: cost = arr[0] + arr[1], return sum(arr2)

Performance

Small improvements

Get rid of arr2. You don't need to keep track of all individual costs, only their total.

As such, you can simply initialize total_cost = 0 and add the latest cost to the total instead of appending it to the list. Finally, simply return that value.

This matters especially on larger scales, are as values are appended to the list, memory is regularly allocated to fit the data, which is rather slow. Updating the value of a single integer requires no further allocation.

Algorithm

The problem states:

Expected Time Complexity : O(nlogn)

In your case, you loop over the size of the array (o(n)) and sort a list in each loop (o(m logm)). Since your list size decreases with each pass, the time complexity isn't quite o(n²logn), and in fact quick testing shows it is approximately o(n²). There lies your performance issue.

Appending and removing from lists is o(1) in Python, so these operations should scale fairly well (o(n), due to the loop).

In fact, on a fairly large input, profiling shows that >90% of the time is spent sorting lists.

The obvious target here is to get rid of the sort. One possibility is to insert the cost into the right position in the array, saving quite some time. On my machine, execution for a list of size 100,000 dropped from 19 seconds to 5, an appreciable improvement. However, the algorithm is still o(n²) overall and won't scale nicely with larger inputs.

Unfortunately, I can't think of a solution right now to hit the o(n logn) target. Try to think outside of the box, and good luck.

Coding style

I suppose your Solution class and function signature are not your call but were imposed by your programming exercise environment (I'm not familiar with GeeksForGeeks, but it looks very similar to LeetCode).

As such, the right call is to stick to these conventions, which you did and which is fine.

However, these conventions differ from the conventions recommended by PEP8. In case you want to practice your coding style as well, there are a few changes that you should make:

  • Rename your method name using snake case: min_cost
  • Document your class and methods using docstrings, not comments: see PEP257
  • Remove unused variables, in this case the n argument which serve no purpose.

Also, some changes should be made even in the context of whatever conversion you are following:

  • there is no need to initialize cost outside of its scope
  • rename arr2 with a more descriptive name, perhaps costs
  • the loop counter x is unused. Such throwaway variables are conventionally named _
  • add some space around operators: cost = arr[0] + arr[1], return sum(arr2)

Performance

Small improvements

Get rid of arr2. You don't need to keep track of all individual costs, only their total.

As such, you can simply initialize total_cost = 0 and add the latest cost to the total instead of appending it to the list. Finally, simply return that value.

This matters especially on larger scales, are as values are appended to the list, memory is regularly allocated to fit the data, which is rather slow. Updating the value of a single integer requires no further allocation.

Algorithm

The problem states:

Expected Time Complexity : O(nlogn)

In your case, you loop over the size of the array (o(n)) and sort a list in each loop (o(m logm)). Since your list size decreases with each pass, the time complexity isn't quite o(n²logn), and in fact quick testing shows it is approximately o(n²). There lies your performance issue.

Appending and removing from the end of lists is o(1) in Python, so the append operations should scale fairly well (o(n), due to the loop). However, the remove operations on the start of the list are o(n), so you should find a way to do without them.

In fact, on a fairly large input, profiling shows that >90% of the time is spent sorting lists.

The obvious target here is to get rid of the sort. One possibility is to insert the cost into the right position in the array, saving quite some time. On my machine, execution for a list of size 100,000 dropped from 19 seconds to 5, an appreciable improvement. However, the algorithm is still o(n²) overall and won't scale nicely with larger inputs.

Unfortunately, I can't think of a solution right now to hit the o(n logn) target. Try to think outside of the box, and good luck.

Source Link
gazoh
  • 3.4k
  • 10
  • 26

Coding style

I suppose your Solution class and function signature are not your call but were imposed by your programming exercise environment (I'm not familiar with GeeksForGeeks, but it looks very similar to LeetCode).

As such, the right call is to stick to these conventions, which you did and which is fine.

However, these conventions differ from the conventions recommended by PEP8. In case you want to practice your coding style as well, there are a few changes that you should make:

  • Rename your method name using snake case: min_cost
  • Document your class and methods using docstrings, not comments: see PEP257
  • Remove unused variables, in this case the n argument which serve no purpose.

Also, some changes should be made even in the context of whatever conversion you are following:

  • there is no need to initialize cost outside of its scope
  • rename arr2 with a more descriptive name, perhaps costs
  • the loop counter x is unused. Such throwaway variables are conventionally named _
  • add some space around operators: cost = arr[0] + arr[1], return sum(arr2)

Performance

Small improvements

Get rid of arr2. You don't need to keep track of all individual costs, only their total.

As such, you can simply initialize total_cost = 0 and add the latest cost to the total instead of appending it to the list. Finally, simply return that value.

This matters especially on larger scales, are as values are appended to the list, memory is regularly allocated to fit the data, which is rather slow. Updating the value of a single integer requires no further allocation.

Algorithm

The problem states:

Expected Time Complexity : O(nlogn)

In your case, you loop over the size of the array (o(n)) and sort a list in each loop (o(m logm)). Since your list size decreases with each pass, the time complexity isn't quite o(n²logn), and in fact quick testing shows it is approximately o(n²). There lies your performance issue.

Appending and removing from lists is o(1) in Python, so these operations should scale fairly well (o(n), due to the loop).

In fact, on a fairly large input, profiling shows that >90% of the time is spent sorting lists.

The obvious target here is to get rid of the sort. One possibility is to insert the cost into the right position in the array, saving quite some time. On my machine, execution for a list of size 100,000 dropped from 19 seconds to 5, an appreciable improvement. However, the algorithm is still o(n²) overall and won't scale nicely with larger inputs.

Unfortunately, I can't think of a solution right now to hit the o(n logn) target. Try to think outside of the box, and good luck.