Check for duplicates in a Python list

Question

I've seen a lot of variations of this question, from things as simple as removing duplicates to finding and listing duplicates. Even taking bits and pieces of these examples does not get me my result.

My question is how am I able to check if my list has a duplicate entry? Even better, does my list have a non-zero duplicate?

I've had a few ideas -

#empty list
myList = [None] * 9 

#all the elements in this list are None

#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3

#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return

If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.

Another solution I thought of was to turn the list into a set and compare the lengths of the set and the list to determine whether there is a duplicate. But when I run set(myList), it not only removes duplicates, it reorders the elements as well. I could keep separate copies, but that seems redundant.

You're on the right track! I would definitely recommend the set operation, as it's a single function call that gets exactly what you need; you can then pop out the Nones and 0s from your final set. — ericmjl, Commented Jan 28, 2015 at 21:27
Regular sets in Python lack a defined order, but there is always OrderedDict — Paulo Scardine, Commented Jan 28, 2015 at 21:27
You'd also want to check that you aren't comparing an index against itself and also that you don't derive 2 duplicates for both times you compare the indexes (e.g. dont count 2 duplicates because 3 is a duplicate of 4 but 4 is also a duplicate of 3) — Paul Rooney, Commented Jan 28, 2015 at 21:32
"Is there a way to ignore the None or 0 case?" Sure: if myList[i] is None or myList[i] == 0: continue — Jasper, Commented Jan 28, 2015 at 21:37

Malik Brahimi · Accepted Answer · 2015-01-28 21:35:24Z

3

If you simply want to check whether the list contains duplicates, you can use the function below. As soon as it finds an element that occurs more than once, it returns 'Duplicate'.

my_list = [1, 2, 2, 3, 4]

def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'

print(check_list(my_list) == 'Duplicate')  # prints True
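
The function above also reports None or 0 when they repeat, which the question wants to ignore. A minimal sketch of a variant that skips falsy values follows; the name has_nonzero_duplicate is made up for illustration, and list.count makes it quadratic in the list length:

```python
def has_nonzero_duplicate(items):
    """Return True if any truthy value occurs more than once in items."""
    # Falsy entries (None, 0, ...) short-circuit before count() is called.
    return any(item and items.count(item) > 1 for item in items)


my_list = [1, None, None, 2, 2, 4, None, 3, None]
print(has_nonzero_duplicate(my_list))          # True, because 2 repeats
print(has_nonzero_duplicate([0, 0, None, 1]))  # False, only 0 repeats
```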

edited Jan 28, 2015 at 21:35

answered Jan 28, 2015 at 21:29

Malik Brahimi


paolo · Accepted Answer · 2015-01-28 21:27:25Z

2

Try changing the actual comparison line to this, which also avoids comparing an element with itself:

if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
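
For context, here is one way the whole nested loop could look once it is wrapped in a function, so that return is legal. This is a sketch rather than the asker's exact code: the function name is hypothetical, and the inner loop starts at j + 1 so that no element is compared with itself and each pair is checked only once.

```python
def first_duplicate_nested(my_list):
    """Return the first value other than None or 0 that appears twice, else None."""
    for j in range(len(my_list)):
        if my_list[j] is None or my_list[j] == 0:
            continue  # ignore empty slots and zeros
        # Only look at later positions: no self-comparison, no repeated pairs.
        for k in range(j + 1, len(my_list)):
            if my_list[j] == my_list[k]:
                return my_list[j]
    return None


my_list = [1, None, None, 2, 2, 4, None, 3, None]
print(first_duplicate_nested(my_list))  # 2
```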

answered Jan 28, 2015 at 21:27

paolo


rchang · Accepted Answer · 2015-01-28 21:29:10Z

I'm not certain whether you are trying to ascertain that a duplicate exists, or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:

# Runs on Python 2.7 and Python 3
from collections import Counter

#
# Rest of your code
#

counter = Counter(myList)
# Keep keys that occur more than once and are truthy (drops None and 0)
dupes = [key for (key, value) in counter.items() if value > 1 and key]
print(dupes)

The Counter object automatically counts occurrences of each item in your iterable. The list comprehension that builds dupes filters out all items that appear only once, as well as items whose boolean evaluation is False (this filters out both 0 and None).

If your purpose is only to identify that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:

if dupes:
    print("Something in the list is duplicated")

Padraic Cunningham · Accepted Answer · 2015-01-28 22:43:09Z

1

To remove dups and keep order, ignoring 0 and None. The not ele test treats every falsy value the same way, so if your list holds other falsy values that should be handled like normal elements, test explicitly for None and 0 instead:

print([ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele])

If you just want the first dup:

for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break

Or store seen in a set:

seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele)
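
Wrapped in a function, the set-based loop gives a yes/no answer directly. A minimal sketch, assuming the elements are hashable; the name has_truthy_duplicate is only illustrative:

```python
def has_truthy_duplicate(lst):
    """Return True as soon as a truthy value is seen a second time."""
    seen = set()
    for ele in lst:
        if not ele:
            continue  # skip None, 0 and other falsy values
        if ele in seen:
            return True
        seen.add(ele)
    return False


print(has_truthy_duplicate([1, None, None, 2, 2, 4, None, 3, None]))  # True
print(has_truthy_duplicate([0, 0, None, None, 1, 2]))                 # False
```

Each element is looked up in the set once, so the loop stays roughly linear in the list length, unlike the slicing versions above.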

edited Jan 28, 2015 at 22:43

answered Jan 28, 2015 at 21:47

Padraic Cunningham


Thank you, but what if I wanted the second duplicate?
– Mike Issa
Commented Aug 3, 2017 at 23:19
2

@MikeIssa, so the second occurrence? It would only really make sense if you wanted the index, or to reorder based on when the second occurrence appeared. If you had a concrete example, it would be easy to implement with a collections.Counter.
– Padraic Cunningham
Commented Aug 4, 2017 at 18:52


jpp · Accepted Answer · 2019-01-05 00:47:48Z

You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.

from collections import defaultdict

def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False

L = [1, None, None, 2, 2, 4, None, 3, None]

res = check_duplicates(L, condition=bool, thresh=1)  # 2

Note that in the above example, using bool as the condition means 0 and None are never counted as threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
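
To illustrate how the condition and thresh arguments interact, here is a short usage sketch with a different input list (the list and the lambda are made up for illustration). The function is repeated so the snippet runs on its own.

```python
from collections import defaultdict


def check_duplicates(it, condition, thresh):
    # Same function as above, repeated so this snippet is self-contained.
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False


L = [0, 0, 1, None, None, 2, 2, 2]

# Ignore only None, so a repeated 0 counts as a duplicate.
print(check_duplicates(L, condition=lambda x: x is not None, thresh=1))  # 0

# Require a truthy value to appear more than twice.
print(check_duplicates(L, condition=bool, thresh=2))  # 2
```

Because the function returns the duplicated value itself, a result of 0 is falsy. When 0 is an allowed value, compare the result with "is False" rather than relying on truthiness.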

ericmjl · Accepted Answer · 2015-01-28 21:28:54Z

-2

Here's a bit of code that shows how to remove None and 0 from the set.

l1 = [0, 1, 1, 2, 4, 7, None, None]

l2 = set(l1)
l2.remove(None)
l2.remove(0)
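
On its own, the snippet above only strips None and 0 from the set. One way to turn the idea into a duplicate check, sketched here with a hypothetical helper name, is to filter the ignored values out of the list first and then compare the length of the filtered list with the length of its set:

```python
def has_duplicate_ignoring(values, ignored=(None, 0)):
    """Return True if any value outside `ignored` occurs more than once."""
    # Drop the ignored values, then see whether building a set loses anything.
    kept = [v for v in values if v not in ignored]
    return len(set(kept)) != len(kept)


l1 = [0, 1, 1, 2, 4, 7, None, None]
print(has_duplicate_ignoring(l1))                        # True, because 1 repeats
print(has_duplicate_ignoring([0, 0, None, None, 1, 2]))  # False
```

Calling set() builds a new object and leaves the original list and its order untouched, so no separate copy of the list is needed.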

answered Jan 28, 2015 at 21:28

ericmjl


How does this tell you if any of the truthy elements are duplicated?
– Mad Physicist
Commented Jan 23, 2019 at 1:50


CodedCuber · Accepted Answer · 2019-01-23 01:44:58Z

-2

In my opinion, this is the simplest solution I could come up with. This should work with any list. The only downside is that it does not count the number of duplicates; it just returns True or False.

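from itertools import combinations
def has_duplicate(mylist):
    # True if any pair of elements (k, j) compares equal
    return any(k == j for k, j in combinations(mylist, 2))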

edited Jan 23, 2019 at 1:44

answered Jan 5, 2019 at 0:24

CodedCuber


2

Please consider explaining your answer.
– jpp
Commented Jan 5, 2019 at 0:29

