2

I've seen many variations of this question, from simply removing duplicates to finding and listing them. Even piecing together bits of those examples doesn't get me the result I need.

My question is: how can I check whether my list has a duplicate entry? Even better, does my list have a non-zero duplicate?

I've had a few ideas -

#empty list
myList = [None] * 9 

#all the elements in this list are None

#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3

#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return

If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.

Another solution I thought of was to turn the list into a set and compare the lengths of the set and the list to determine whether there is a duplicate. But running set(myList) not only removes duplicates, it orders the elements as well. I could keep separate copies, but that seems redundant.

7
  • 1
    You're on the right track! I would definitely recommend the set operation, as it's a single function call that gets exactly what you need; you can then pop out the Nones and 0s from your final set.
    – ericmjl
    Commented Jan 28, 2015 at 21:27
  • Regular sets in Python lack a defined order, but there is always OrderedDict.
    Commented Jan 28, 2015 at 21:27
  • 3
    return outside of function = syntax error
    Commented Jan 28, 2015 at 21:30
  • You'd also want to check that you aren't comparing an index against itself, and that you don't report the same pair twice (e.g. don't count two duplicates because 3 is a duplicate of 4 and 4 is also a duplicate of 3).
    Commented Jan 28, 2015 at 21:32
  • 1
    "Is there a way to ignore the None or 0 case?" Sure: if myList[i] is None or myList[i] == 0: continue
    – Jasper
    Commented Jan 28, 2015 at 21:37

7 Answers

3

If you simply want to check whether the list contains duplicates, this function returns 'Duplicate' as soon as it finds an element that occurs more than once.

my_list = [1, 2, 2, 3, 4]

def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'

print check_list(my_list) == 'Duplicate' # prints True
2

Try changing the actual comparison line to this:

if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
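
For context, here is a minimal sketch of the whole nested loop with the None/0 check, wrapped in a function so that the return is legal. The function name has_nonzero_duplicate is hypothetical, and starting the inner loop at j + 1 is my assumption rather than part of the question's code; it keeps an element from being compared with itself or with an earlier pair.

```python
def has_nonzero_duplicate(items):
    """Return True if any value other than None or 0 appears more than once."""
    for j in range(len(items)):
        if items[j] in (None, 0):
            continue  # ignore empty slots and zeros
        # Start at j + 1 so an element is never compared with itself
        for k in range(j + 1, len(items)):
            if items[j] == items[k]:
                return True
    return False


# Same data as in the question
myList = [None] * 9
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3

print(has_nonzero_duplicate(myList))  # True
```

This is still quadratic in the length of the list, which should be fine for a nine-element list.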
2

I'm not certain whether you are trying to ascertain that a duplicate exists, or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:

# Python 2.7
from collections import Counter

#
# Rest of your code
#

counter = Counter(myList)
dupes = [key for (key, value) in counter.iteritems() if value > 1 and key]
print dupes

The Counter object will automatically count occurrences of each item in your iterable list. The list comprehension that builds dupes filters out all items appearing only once, and also items whose boolean evaluation is False (this filters out both 0 and None).

If your purpose is only to identify that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:

if dupes:  print "Something in the list is duplicated"
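
On Python 3, iteritems() no longer exists and print is a function, so the same idea might look like the sketch below. The list literal is the question's list written out in full; everything else follows the code above.

```python
from collections import Counter

myList = [1, None, None, 2, 2, 4, None, 3, None]

counter = Counter(myList)
# Keep truthy keys seen more than once; `and key` drops both None and 0
dupes = [key for key, count in counter.items() if count > 1 and key]

print(dupes)  # [2]
if dupes:
    print("Something in the list is duplicated")
```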
1

To remove duplicates and keep order, ignoring 0 and None (if you have other falsy values that you want to keep, you will need to check explicitly for is not None and != 0):

print [ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele]

If you just want the first dup:

for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break

Or store seen in a set:

seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele) 
2
  • Thank you, but what if I wanted the second duplicate?
    – Mike Issa
    Commented Aug 3, 2017 at 23:19
  • 2
    @MikeIssa, so the second occurrence? It would only really make sense if you wanted the index, or to reorder based on when the second one appeared. If you had a concrete example, it would be easy to implement with a collections.Counter.
    Commented Aug 4, 2017 at 18:52
0

You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.

from collections import defaultdict

def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False

L = [1, None, None, 2, 2, 4, None, 3, None]

res = check_duplicates(L, condition=bool, thresh=1)  # 2

Note in the above example the function bool will not consider 0 or None for threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
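
To illustrate how the condition and threshold interact, here is a short sketch with two other settings. The function is repeated only so the snippet runs on its own; the particular lambda and the threshold of 2 are arbitrary choices for the example.

```python
from collections import defaultdict


def check_duplicates(it, condition, thresh):
    # Same function as above, repeated so this snippet runs on its own
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False


L = [1, None, None, 2, 2, 4, None, 3, None]

# Ignore only None: None appears four times but is never reported
print(check_duplicates(L, condition=lambda x: x is not None, thresh=1))  # 2

# Raise the threshold: no truthy value appears more than twice
print(check_duplicates(L, condition=bool, thresh=2))  # False
```

One caveat: if the condition lets 0 through, the function can return 0 for a duplicated zero, which is itself falsy. In that case comparing the result with is not False is safer than relying on truthiness.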

-2

Here's a bit of code that will show you how to remove None and 0 from the sets.

l1 = [0, 1, 1, 2, 4, 7, None, None]

l2 = set(l1)
l2.remove(None)
l2.remove(0)
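
To turn this into a yes/no duplicate check, one option is to filter the list first and then compare lengths, as the question suggests. This is a minimal sketch: the names kept and has_duplicate are made up for illustration, and it assumes every element is hashable. Filtering up front also sidesteps the KeyError that remove() raises when None or 0 is not in the set.

```python
l1 = [0, 1, 1, 2, 4, 7, None, None]

# Keep only the values we care about (drops None, 0 and other falsy values)
kept = [x for x in l1 if x]

# A set holds each value once, so a shorter set means something repeated
has_duplicate = len(set(kept)) != len(kept)

print(has_duplicate)  # True
```

The original list is never modified, so its order is untouched.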
1
  • How does this tell you if any of the truthy elements are duplicated?
    Commented Jan 23, 2019 at 1:50
-2

In my opinion, this is the simplest solution I could come up with. This should work with any list. The only downside is that it does not count the number of duplicates, but instead just returns True or False.

for k, j in mylist:
    return k == j
1
  • 2
    Please consider explaining your answer.
    – jpp
    Commented Jan 5, 2019 at 0:29
