mongoimport


mongoimport very very slow for large json file


I have a large JSON file (350 GB) and I am trying to import it into a MongoDB collection using mongoimport. The import is very slow, and I am not sure how many days it will take.
Can anyone suggest the best way to load this JSON file into a MongoDB collection? I have enough disk space for it.
I came across a similar situation. I used mongorestore instead of mongoimport, but the idea is the same. iotop showed that the restore process had an I/O rate of about 1 M/s, which is pretty low. As another post here suggests, the low performance is probably due to the JSON-to-BSON serialization. So I ended up splitting the export into separate chunks, one per value of a field, with the following commands:
mongodump --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> --query '{"DayOfWeek": "Monday"}' --out "SomeDir-Monday" &
mongodump --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> --query '{"DayOfWeek": "Tuesday"}' --out "SomeDir-Tuesday" &
...
That left me with 7 chunks.
Finally, import these chunks in parallel using mongorestore with the following commands. Note that mongodump writes BSON files, so the paths point at .bson files:
mongorestore --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> PATH_TO_MONDAY.bson &
mongorestore --host <host> --port <port> --username <user> --password <pwd> --authenticationDatabase admin --db <db> --collection <coll> PATH_TO_TUESDAY.bson &
...
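The dump-then-restore steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the database and collection names (mydb, mycoll), the DayOfWeek field, and the helper names run, dump_all, and restore_all are hypothetical, and the host and authentication options are left out for brevity. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: dump one chunk per weekday in parallel, then restore the chunks in parallel.
# DB, COLL and the DayOfWeek field are placeholders; add --host/--username etc. as needed.

DB="${DB:-mydb}"
COLL="${COLL:-mycoll}"
DAYS=(Monday Tuesday Wednesday Thursday Friday Saturday Sunday)
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Start one mongodump per day in the background, then block until all have finished.
dump_all() {
  local day
  for day in "${DAYS[@]}"; do
    run mongodump --db "$DB" --collection "$COLL" \
      --query "{\"DayOfWeek\": \"$day\"}" --out "SomeDir-$day" &
  done
  wait
}

# mongodump writes each chunk to <out>/<db>/<collection>.bson; restore them in parallel.
restore_all() {
  local day
  for day in "${DAYS[@]}"; do
    run mongorestore --db "$DB" --collection "$COLL" \
      "SomeDir-$day/$DB/$COLL.bson" &
  done
  wait
}
```

After sourcing the script, calling dump_all followed by restore_all prints the fourteen commands. Setting DRY_RUN=0 runs them for real. Whether seven parallel jobs is the right number depends on the disk and CPU of the machine.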
If you are using a MongoDB version newer than 3.0.0, you can pass the --numInsertionWorkers option to the mongoimport command.
Set it to the number of CPUs you have in order to speed up the import.
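As a rough illustration of that suggestion, the worker count can be derived from the machine instead of being hard-coded. The helper names cpu_count and import_json, the default database and collection names, and the MONGOIMPORT override variable are assumptions made for this sketch. Only --db, --collection, --file, and --numInsertionWorkers are actual mongoimport options.

```shell
#!/usr/bin/env bash
# Sketch: run mongoimport with one insertion worker per CPU.

# Number of CPUs: nproc on Linux, sysctl on macOS, falling back to 1.
cpu_count() {
  nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1
}

# $1 = path to the JSON file to import.
# DB and COLL are placeholders; MONGOIMPORT lets you point at a specific binary.
import_json() {
  local file="$1"
  "${MONGOIMPORT:-mongoimport}" \
    --db "${DB:-mydb}" --collection "${COLL:-mycoll}" \
    --numInsertionWorkers "$(cpu_count)" \
    --file "$file"
}
```

For example, import_json big.json would start the import with as many workers as the machine reports CPUs. If the database server runs on the same host, a lower number may work better, so treat the CPU count as a starting point.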
