You can deal with this by inspecting the BulkWriteError that gets raised. This is actually an "object" which has several properties. The interesting parts are in details:
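import pymongo
from bson.json_util import dumps  # handy for printing e.details as JSON
from pymongo import MongoClient

client = MongoClient()
db = client.test
collection = db.duptest

# Two documents share _id 1, so the second one is a duplicate
docs = [{ '_id': 1 }, { '_id': 1 },{ '_id': 2 }]

try:
    # ordered=False keeps going after an error instead of stopping at the first one
    result = collection.insert_many(docs, ordered=False)
except pymongo.errors.BulkWriteError as e:
    print(e.details['writeErrors'])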
On a first run, this will give the list of errors under e.details['writeErrors']:
[
  {
    'index': 1,
    'code': 11000,
    'errmsg': 'E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }',
    'op': {'_id': 1}
  }
]
On a second run, you see three errors because all items existed:
[
  {
    "index": 0,
    "code": 11000,
    "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }",
    "op": {"_id": 1}
  },
  {
    "index": 1,
    "code": 11000,
    "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 1 }",
    "op": {"_id": 1}
  },
  {
    "index": 2,
    "code": 11000,
    "errmsg": "E11000 duplicate key error collection: test.duptest index: _id_ dup key: { : 2 }",
    "op": {"_id": 2}
  }
]
So all you need to do is filter the array for entries with "code": 11000 and then only "panic" when something else is in there:
# Inside the "except pymongo.errors.BulkWriteError as e:" block
panic = [x for x in e.details['writeErrors'] if x['code'] != 11000]
if panic:
    print("really panic")
That gives you a mechanism for ignoring the duplicate key errors while still paying attention to anything that is actually a problem.
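If you want that check in one reusable place, something like the following would do. This is only a sketch: the helper name split_write_errors and the DUPLICATE_KEY constant are made up for illustration and are not part of pymongo. It only assumes the details dictionary has the 'writeErrors' list shown above.

```python
DUPLICATE_KEY = 11000  # error code that accompanies the E11000 message


def split_write_errors(details):
    """Split details['writeErrors'] into (duplicates, others).

    `details` is expected to look like BulkWriteError.details, i.e. a dict
    with a 'writeErrors' list of dicts that each carry a 'code'.
    (Hypothetical helper, not a pymongo function.)
    """
    duplicates, others = [], []
    for err in details.get('writeErrors', []):
        if err.get('code') == DUPLICATE_KEY:
            duplicates.append(err)
        else:
            others.append(err)
    return duplicates, others


# Intended use, inside the handler from the example above:
#
#     except pymongo.errors.BulkWriteError as e:
#         duplicates, others = split_write_errors(e.details)
#         if others:
#             raise  # something other than a duplicate key went wrong
```

Re-raising only when others is non-empty keeps the original exception and traceback for the cases that matter, while duplicates pass through silently.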
With ordered=False, bulk "inserts" still throw errors, even though the rest of the batch actually commits. It is up to you whether to try .. except and essentially "ignore" the duplicate key error, or, if you really don't want to live with that, to use "upserts" instead. That does require what is effectively a "find" on each document, but by nature it "cannot" create a duplicate key. It's just how it works.

BulkWriteError, or whatever the particular class is in Python (need to look that up), will list each error in an array. Those entries have an error code, which is 11000 (the E11000 in the message) off the top of my head. Simply process and ignore those, and of course really "throw/complain/log/whatever" on any other code present.
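For the "upserts" route, a minimal sketch could look like this. The function name upsert_all is hypothetical; the only pymongo call it relies on is Collection.replace_one with upsert=True, and it assumes every document carries its own _id. Note the trade-off: unlike the insert-and-ignore approach, this overwrites a document that already exists, and it issues one request per document.

```python
def upsert_all(collection, docs):
    """Upsert each document by _id and return how many were newly inserted.

    `collection` is assumed to be a pymongo Collection (anything with a
    compatible replace_one method works). Hypothetical helper.
    """
    inserted = 0
    for doc in docs:
        # With upsert=True the document is inserted when nothing matches the
        # filter, and replaces the existing document otherwise, so no
        # duplicate key error is raised for a repeated _id.
        result = collection.replace_one({'_id': doc['_id']}, doc, upsert=True)
        if result.upserted_id is not None:
            inserted += 1
    return inserted
```

With the docs list from the answer, a call such as upsert_all(collection, docs) on an empty collection should report 2 new documents, and running it again should report 0.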