Skip to main content
added 38 characters in body
Source Link
Snowhawk
  • 6.8k
  • 1
  • 19
  • 37
    int length = str.length();

Whenever you cache results like this, you are telling the reader that calculating the length may be expensive or that the size will not change. That doesn't hold true when you erase elements as you encounter them. The size of the string will shrink with each removal. Without adjusting the size, you'll have access violations when you attempt to read beyond the new, smaller size, if a removal occurred.


  for(unsigned int i = 0; i < length; i++) {
    char currChar = str[i]; //holds current character
    for(unsigned int j = i+1; j < length; j++) {
      if(currChar == str[j])
        str.erase (std::remove(str.begin()+j, str.end(), str[j]), str.end());

You seem to have a misunderstanding of how std::remove works. std::remove shifts all values that don't match the value forward. In your code, you remove the duplicates when you encounter the first one, but you keep searching for duplicates (that don't exist) instead of moving on.


The standard library provides std::remove_if, which is a function that will remove elements based on a condition. The remove functions are designed to do the looping while you provide it with either a value or predicate.

Poorly-chosen names can mislead the reader and cause bugs. It is important to use descriptive names that match the semantics and role of the underlying entities, within reason. Are you removing elements (shifting elements forward with unspecified elements in the removal area)? Are you erasing elements (removing elements the destructing the unspecified elements)?

auto remove_duplicates(std::string& str) {
  return std::remove_if(str.begin(), str.end(), predicate);
}

std::string& erase_duplicates(std::string& str) {
  str.erase(remove_duplicates(str), str.end());
  return str;
}

As for the predicate, others have suggested using an array lookup scheme, but that violates the "no additional buffer" requirement of the problem. Another way is to keep track of the area that has already been processed as deduplicated (similar to insertion sort working with its processed range).

auto remove_duplicates(std::string& str) {
  auto deduplicated_end = str.begin();
  auto is_duplicate = [&](char ch) {
    auto found = std::find(str.begin(), deduplicated_end, ch) != deduplicated_end;
    if (!found)
      ++deduplicated_end;
    return found;
  };
  return std::remove_if(str.begin(), str.end(), is_duplicate);
}
    int length = str.length();

Whenever you cache results like this, you are telling the reader that calculating the length may be expensive or that the size will not change. That doesn't hold true when you erase elements as you encounter them. The size of the string will shrink with each removal. Without adjusting the size, you'll have access violations when you attempt to read beyond the new, smaller size, if a removal occurred.


  for(unsigned int i = 0; i < length; i++) {
    char currChar = str[i]; //holds current character
    for(unsigned int j = i+1; j < length; j++) {
      if(currChar == str[j])
        str.erase (std::remove(str.begin()+j, str.end(), str[j]), str.end());

You seem to have a misunderstanding of how std::remove works. std::remove shifts all values that don't match the value forward. In your code, you remove the duplicates when you encounter the first one, but you keep searching for duplicates (that don't exist) instead of moving on.


The standard library provides std::remove_if, which is a function that will remove elements based on a condition. The remove functions are designed to do the looping while you provide it with either a value or predicate.

Poorly-chosen names can mislead the reader and cause bugs. It is important to use descriptive names that match the semantics and role of the underlying entities, within reason. Are you removing elements (shifting elements forward with unspecified elements in the removal area)? Are you erasing elements (removing elements the destructing the unspecified elements)?

auto remove_duplicates(std::string& str) {
  return std::remove_if(str.begin(), str.end(), predicate);
}

std::string& erase_duplicates(std::string& str) {
  str.erase(remove_duplicates(str), str.end());
  return str;
}

As for the predicate, others have suggested using an array lookup scheme, but that violates the "no additional buffer" requirement of the problem. Another way is to keep track of the area that has already been processed as deduplicated (similar to insertion sort).

auto remove_duplicates(std::string& str) {
  auto deduplicated_end = str.begin();
  auto is_duplicate = [&](char ch) {
    auto found = std::find(str.begin(), deduplicated_end, ch) != deduplicated_end;
    if (!found)
      ++deduplicated_end;
    return found;
  };
  return std::remove_if(str.begin(), str.end(), is_duplicate);
}
    int length = str.length();

Whenever you cache results like this, you are telling the reader that calculating the length may be expensive or that the size will not change. That doesn't hold true when you erase elements as you encounter them. The size of the string will shrink with each removal. Without adjusting the size, you'll have access violations when you attempt to read beyond the new, smaller size, if a removal occurred.


  for(unsigned int i = 0; i < length; i++) {
    char currChar = str[i]; //holds current character
    for(unsigned int j = i+1; j < length; j++) {
      if(currChar == str[j])
        str.erase (std::remove(str.begin()+j, str.end(), str[j]), str.end());

You seem to have a misunderstanding of how std::remove works. std::remove shifts all values that don't match the value forward. In your code, you remove the duplicates when you encounter the first one, but you keep searching for duplicates (that don't exist) instead of moving on.


The standard library provides std::remove_if, which is a function that will remove elements based on a condition. The remove functions are designed to do the looping while you provide it with either a value or predicate.

Poorly-chosen names can mislead the reader and cause bugs. It is important to use descriptive names that match the semantics and role of the underlying entities, within reason. Are you removing elements (shifting elements forward with unspecified elements in the removal area)? Are you erasing elements (removing elements the destructing the unspecified elements)?

auto remove_duplicates(std::string& str) {
  return std::remove_if(str.begin(), str.end(), predicate);
}

std::string& erase_duplicates(std::string& str) {
  str.erase(remove_duplicates(str), str.end());
  return str;
}

As for the predicate, others have suggested using an array lookup scheme, but that violates the "no additional buffer" requirement of the problem. Another way is to keep track of the area that has already been processed as deduplicated (similar to insertion sort working with its processed range).

auto remove_duplicates(std::string& str) {
  auto deduplicated_end = str.begin();
  auto is_duplicate = [&](char ch) {
    auto found = std::find(str.begin(), deduplicated_end, ch) != deduplicated_end;
    if (!found)
      ++deduplicated_end;
    return found;
  };
  return std::remove_if(str.begin(), str.end(), is_duplicate);
}
Source Link
Snowhawk
  • 6.8k
  • 1
  • 19
  • 37

    int length = str.length();

Whenever you cache results like this, you are telling the reader that calculating the length may be expensive or that the size will not change. That doesn't hold true when you erase elements as you encounter them. The size of the string will shrink with each removal. Without adjusting the size, you'll have access violations when you attempt to read beyond the new, smaller size, if a removal occurred.


  for(unsigned int i = 0; i < length; i++) {
    char currChar = str[i]; //holds current character
    for(unsigned int j = i+1; j < length; j++) {
      if(currChar == str[j])
        str.erase (std::remove(str.begin()+j, str.end(), str[j]), str.end());

You seem to have a misunderstanding of how std::remove works. std::remove shifts all values that don't match the value forward. In your code, you remove the duplicates when you encounter the first one, but you keep searching for duplicates (that don't exist) instead of moving on.


The standard library provides std::remove_if, which is a function that will remove elements based on a condition. The remove functions are designed to do the looping while you provide it with either a value or predicate.

Poorly-chosen names can mislead the reader and cause bugs. It is important to use descriptive names that match the semantics and role of the underlying entities, within reason. Are you removing elements (shifting elements forward with unspecified elements in the removal area)? Are you erasing elements (removing elements the destructing the unspecified elements)?

auto remove_duplicates(std::string& str) {
  return std::remove_if(str.begin(), str.end(), predicate);
}

std::string& erase_duplicates(std::string& str) {
  str.erase(remove_duplicates(str), str.end());
  return str;
}

As for the predicate, others have suggested using an array lookup scheme, but that violates the "no additional buffer" requirement of the problem. Another way is to keep track of the area that has already been processed as deduplicated (similar to insertion sort).

auto remove_duplicates(std::string& str) {
  auto deduplicated_end = str.begin();
  auto is_duplicate = [&](char ch) {
    auto found = std::find(str.begin(), deduplicated_end, ch) != deduplicated_end;
    if (!found)
      ++deduplicated_end;
    return found;
  };
  return std::remove_if(str.begin(), str.end(), is_duplicate);
}