αғsнιη

Find and Replace Duplicates in a CSV

I have a CSV file containing comma-joined emails that looks like the following:

id  emails
1   [email protected]
2   [email protected]
3   [email protected],[email protected],[email protected]

Each row contains only distinct emails, but an email may be duplicated from one row to another, as seen above in rows 1 and 3. I need to remove those duplicates so that each email survives only in the first row where it appears, leaving the file looking like the following:

id  emails
1   [email protected]
2   [email protected]
3   [email protected],[email protected]

This means each row needs to be checked against every row that follows it, which isn't feasible with any kind of pairwise iterative script given the amount of data I have. I suspect there is a simple (or at least viable) way to accomplish this with awk or sed, but I haven't found one yet. Any help is greatly appreciated!
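One possible approach: rather than comparing rows pairwise, a single awk pass can record every address already printed in an associative array and drop any address seen on an earlier row. This is only a sketch under assumptions not stated in the question: the columns are tab-separated, the comma-joined addresses are in the second field, and `emails.txt` is a placeholder filename. Adjust `-F` to match the real delimiter.

```shell
# Sketch, not a drop-in solution: assumes tab-separated columns with the
# comma-joined addresses in field 2; "emails.txt" is a placeholder name.
awk -F'\t' '
NR == 1 { print; next }                  # pass the header through untouched
{
    out = ""
    n = split($2, addrs, ",")
    for (i = 1; i <= n; i++) {
        if (!(addrs[i] in seen)) {       # first time this address appears
            seen[addrs[i]] = 1
            out = out (out == "" ? "" : ",") addrs[i]
        }
    }
    # A row whose addresses were all seen earlier is printed with an
    # empty emails field; filter those afterwards if that is unwanted.
    print $1 FS out
}' emails.txt
```

Memory use is one hash entry per distinct address rather than anything proportional to the number of row pairs, so this should scale to large files far better than checking each row against all later rows.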
