Monthly Archives: March 2014

Sharding in mongodb

Just putting together, what we have done in sharding so that we dont forget it later. I shall update this doc as and when we have more details.

Sharding means, distributing data across multiple servers.Basically, mongodb sharding requires three things

1) a config server which stores the metadata which knows where the data resides

2) a query router server is the server to which the application actually communicates. It contacts the config servers to find in which shard the data resides and retrieves the data to the application.

3) shard servers – this consists of a subset of the entire data, distributed across multiple servers

In our case, for test purpose, we used the minimal number of servers. Mongod eats up ram hence, it will be good to use servers with somewhat good specifications. We used

1) 1 config servers

2) 1 query router server. We combined config server and query router server into one, hence these two required only 1 server.

3) In order to see how sharding actually works, we needed 2 shard servers. So a total of 3 servers

Install mongo in all servers as mentioned in http://greproot.com/install-mongodb-centos/

Setting up config server

————————–

Hostname of server chosen as config.mongotest.com

mkdir /mongo-metadata     – create a folder for the mongo metadata

Now start mongo config server as follows. Make sure to use the port as 27019. Whatever number of config servers you use, you need to make sure the path and port are same for all.

mongod –configsvr –fork –logpath=/var/log/mongo/mongod.log  –dbpath /mongo-metadata –port 27019

Setting up query router

—————————-

Please note I chose config and query router servers to be same. If you have an alternate server, use it as queryrouter server. Query router use the mongos service. Mongos runs on port 27017.

mkdir /queryrouter_log

Start mongos as follows.

mongos –fork –logpath /queryrouter_log/query.log –configdb config.mongotest.com:27019

Shard Servers

—————–

Hostnames chosen are shard1.mongotest.com and shard2.mongotest.com. Just start mongodb in both servers and it will run on port 27017

We dont have to setup shard servers separately.  Just login to any one shard server and you can setup all shards from there itself.

Login to any shard server as root. Connect to the query router server from there as follows.

mongo –host config.mongotest.com –port 27017

above command connects to the mongo shell of queryrouter server, at port 27017 which runs mongos

mongo –host config.mongotest.com –port 27017

MongoDB shell version: 2.4.9
connecting to: config.mongotest.com:27017/test

Add the two shard servers first
mongos> sh.addShard( “shard1.mongotest.com:27017″ )
mongos> sh.addShard( “shard2.mongotest.com:27017″ )

Create a new database

mongos> use divya_test
switched to db divya_test
Enable Sharding for that db

mongos> sh.enableSharding(“divya_test”)
{ “ok” : 1 }

Create a new collection test with an index _id

mongos> db.test.ensureIndex( { _id : “hashed” } )

Now shard this collection using a hashed shard key(i am not very sure of how shard keys has to be selected)

sh.shardCollection(“divya_test.test”, { “_id”: “hashed” } )

 

You can see the status of the shards by issuing the following command

mongos>sh.status()

Now, try adding some data to the collection and check both shard servers. You will see the data is spread across those servers.

mongos> db.test.save({_id:1})
mongos> db.test.save({_id:2})
mongos> db.test.save({_id:3})
mongos> db.test.find()
{ “_id” : 1 }
{ “_id” : 2 }
{ “_id” : 3 }