0

I'm writing a script that loads a large number of XML files into MongoDB. When I run the script more than once, the same objects are added to the same collection again.

I looked for a way to prevent this by checking whether the object already exists before adding it, but I couldn't find one.

Help!

2 Answers

1

The term for the operation you're describing is "upsert".

In MongoDB, you upsert by calling the update operation with upsert=True: it updates the document that matches the filter, or inserts a new one if nothing matches.
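A minimal sketch of the upsert approach, assuming PyMongo and assuming each parsed XML file becomes a dict with a field that uniquely identifies it. The `upsert_document` helper and the `source_file` key are hypothetical names for illustration, not something from the question.

```python
def upsert_document(collection, doc, key_field="source_file"):
    """Insert doc, or update the existing document that has the same key.

    `collection` is expected to behave like a pymongo Collection.
    `key_field` is a hypothetical field that uniquely identifies one XML file.
    """
    query = {key_field: doc[key_field]}
    # upsert=True: update the matching document if there is one,
    # otherwise insert a new document built from the query and the $set fields.
    return collection.update_one(query, {"$set": doc}, upsert=True)
```

With PyMongo this could be called as `upsert_document(MongoClient()["mydb"]["xml_docs"], parsed_doc)`, where the database and collection names are placeholders. Running the script twice then leaves one document per `source_file` value instead of two.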


0

You can create an index on one or more fields (other than _id) of the document/XML structure. Then use find to check whether a document with that value in the indexed field is already present in the collection. If it returns nothing, insert the new document. This ensures that only new documents are inserted when you re-run the script.
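The check-then-insert approach described above could look roughly like this, assuming PyMongo. The helper names and the `source_file` key are hypothetical. Making the index unique is an extra assumption beyond what the answer says: it also makes MongoDB reject duplicates if two runs race each other, but creating it fails if the collection already contains duplicates.

```python
def ensure_key_index(collection, key_field="source_file"):
    """Create the index used by the lookup (run once at script start)."""
    # unique=True is an assumption added here, not part of the answer.
    return collection.create_index(key_field, unique=True)


def insert_if_absent(collection, doc, key_field="source_file"):
    """Insert doc only if no document with the same key value exists.

    Returns True if the document was inserted, False if it was skipped.
    """
    query = {key_field: doc[key_field]}
    # find_one returns None when nothing matches the query.
    if collection.find_one(query) is not None:
        return False
    collection.insert_one(doc)
    return True
```

Because the lookup filters on the indexed field, the existence check stays fast even when the collection is large.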

2 Comments

Actually, I found that something like db.collection.find(object).limit(1) returns the first object it finds matching "object". I can't use $exists like this: db.collection.find(o, {$exists: true}), can I?
Apologies for the confusion; "$exists" does not make sense in this scenario. Just use "find", but on an indexed field so that the query is faster.
