This describes how to run MongoDB on Amazon Web Services, in particular using Amazon Elastic Compute Cloud (EC2) to support a MongoDB replica set.
Choice of machine image
A 64-bit machine is needed if a mongod is to manage more than 2 GB of data. It will also need medium or high IO performance, which indicates a Large instance as the minimum realistic starting point. The 850 GB of instance storage would be used for the MongoDB data storage. This will cost $0.34 per hour (pricing for the US East region), so a replica set with two machines (one master and one slave) will cost $0.68 an hour, or about $5,957 a year. That’s the on-demand pricing; because we want to run 24×7, reserved instances may be cheaper. Using reserved instances with a one-year term in the same region costs $0.12 per hour with a $910 down payment each, giving a total of about $3,922 a year.
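The annual figures can be checked with a few lines of shell. This is only a sketch: it assumes a 365-day year (8,760 hours) and uses the hourly rates and down payment quoted above. The function names are made up for illustration.

```shell
#!/bin/sh
# Annual cost check for a two-machine replica set.
# Assumption: a year is 365 days, i.e. 8,760 hours.
HOURS_PER_YEAR=$((24 * 365))

# annual_on_demand RATE_PER_HOUR MACHINES
annual_on_demand() {
  awk -v rate="$1" -v n="$2" -v h="$HOURS_PER_YEAR" \
    'BEGIN { printf "%.2f\n", rate * n * h }'
}

# annual_reserved RATE_PER_HOUR DOWN_PAYMENT_PER_MACHINE MACHINES
annual_reserved() {
  awk -v rate="$1" -v up="$2" -v n="$3" -v h="$HOURS_PER_YEAR" \
    'BEGIN { printf "%.2f\n", (rate * h + up) * n }'
}

echo "on-demand: \$$(annual_on_demand 0.34 2)"
echo "reserved:  \$$(annual_reserved 0.12 910 2)"
```

Changing the last argument shows what adding a third machine would do to either total.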
Location
For reliability in the event of the failure of an availability zone, there should be a second slave in a different availability zone in the same region, or even in a different region. This will incur data transfer charges between the master and the second slave. These are much cheaper between availability zones in the same region ($0.01 per GB transferred) than between regions ($0.10 per GB transferred in plus from $0.15 per GB transferred out).
If the load is low it would be nice to be able to use the machines for other purposes as well, for instance a primary web server with a slave MongoDB server on one machine and a backup web server with a master MongoDB server on the other.
Experimental
To start with I want to experiment with using Micro instances, even though these have slower EBS storage, to see what the throughput is.
Cookbook
This describes the actual steps I took to get a MongoDB replica set working using manual commands. The next post will build on this to show how to automate the process of starting and stopping a replica set. You have to have automation to work successfully with cloud services.
Security Group
Create a security group to restrict connections into the instances running MongoDB. Both master and slave instances will use this same group, because a slave may need to take over from a master, so no distinction should exist between them.
#
ec2-authorize MongoDB -P tcp -p 22
ec2-authorize MongoDB -P tcp -p 27017 -u 844613644011 -o MongoDB
ec2-authorize MongoDB -P tcp -p 27017 -u 844613644011 -o AppServer
ec2-authorize MongoDB -P tcp -p 27017 -s 24.153.207.123/32
Install MongoDB
Create a volume for the MongoDB executables. This isn’t going to hold the data so it doesn’t have to be very big. Create the smallest one we can, which is 1 GB. Record the volume id.
#
ec2-create-volume --size 1 --availability-zone us-east-1a
Start a micro instance in the same availability zone. KeyPair20110224 is the key I’m going to use to log on with SSH later.
#
ec2-run-instances ami-74f0061d -g MongoDB -k KeyPair20110224 -t t1.micro --availability-zone us-east-1a
Wait till it’s running.
#
ec2-describe-instances
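Picking the state and addresses out of that output by eye gets tedious. A small filter like the one below could help. It is a sketch that assumes the output is tab-separated, with INSTANCE lines laid out as: instance id, AMI id, public DNS name, private DNS name, state. The function name is hypothetical.

```shell
# instance_summary: reads ec2-describe-instances output on stdin and prints
# "instance-id state public-dns private-dns" for each INSTANCE line.
# Assumption: tab-separated fields in the order
#   INSTANCE, instance id, AMI id, public DNS, private DNS, state, ...
instance_summary() {
  awk -F '\t' '$1 == "INSTANCE" { print $2, $6, $4, $5 }'
}

# Typical use (not run here):
#   ec2-describe-instances | instance_summary
```

The private DNS name in the last column is the one needed later when the replica set members have to find each other.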
Attach the new volume to the instance. Instance id from the output of ec2-describe-instances above, volume id from when the volume was created. You could also use ec2-describe-volumes.
#
ec2-attach-volume vol-77815b1c --instance i-73c2221d --device /dev/sdf
Log on to the instance. The address comes from the ec2-describe-instances output.
#
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-174-129-155-243.compute-1.amazonaws.com
Format the volume and mount it.
#
sudo mkfs -t ext3 /dev/sdf
sudo mkdir /mongodb
sudo mount /dev/sdf /mongodb
Download the MongoDB package and unpack it onto the volume we mounted. This uses the 64-bit Linux package.
#
curl http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-1.8.1.tgz > mongodb.tgz
sudo mv mongodb.tgz /mongodb/
cd /mongodb
sudo tar xzf mongodb.tgz
sudo rm mongodb.tgz
Create a data directory under the ec2-user home directory.
#
cd
mkdir -p data/db
Start mongod.
#
cd /mongodb/mongodb-linux-x86_64-1.8.1
bin/mongod --dbpath /home/ec2-user/data/db --logpath /home/ec2-user/mongodb.log --fork
Test that mongod is working and can be reached locally using the shell. Start mongo and try some commands.
#
bin/mongo
MongoDB shell version: 1.8.1
connecting to: test
> db
test
> post = {"title" : "My Test Post", "content" : "Some stuff I wrote"}
{ "title" : "My Test Post", "content" : "Some stuff I wrote" }
> db.blog.insert(post)
> db.blog.find()
{ "_id" : ObjectId("4dcc9ed4052e568e1f58cb00"), "title" : "My Test Post", "content" : "Some stuff I wrote" }
> exit
bye
Stop mongod. Look in the log to get the process id or record it when the instance starts.
#
head /home/ec2-user/mongodb.log
sudo kill -2 1032
tail /home/ec2-user/mongodb.log
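Reading the process id out of the log by hand can be replaced with a one-line helper. The sketch below assumes the mongod startup line in the log contains "pid=" followed by the process id. The function name is made up for illustration.

```shell
# mongod_pid_from_log LOGFILE
# Prints the pid from the most recent startup line in the log.
# Assumption: the startup line contains "pid=NNNN".
mongod_pid_from_log() {
  sed -n 's/.*pid=\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Typical use (not run here):
#   kill -2 "$(mongod_pid_from_log /home/ec2-user/mongodb.log)"
```

Taking the last match matters if the same log file has been appended to across several restarts.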
Create a snapshot
Unmount the EBS volume. This is so we can take a snapshot to use to create volumes when running instances.
#
cd
sudo umount -d /mongodb
Log off the instance and detach the volume. Use ec2-describe-volumes to check that the detach has completed.
#
ec2-detach-volume vol-77815b1c
ec2-describe-volumes vol-77815b1c
Terminate the instance.
#
ec2-terminate-instances i-73c2221d
Take a snapshot of the volume. This is so we can launch other instances using the snapshot to create and attach a volume.
#
ec2-create-snapshot vol-77815b1c -d 'MongoDB installation'
Test the snapshot
Create a CloudInit configuration file called MongoDBInit.txt to start an instance; its contents are shown below. For this experiment I’ll use the space on the default drive associated with the instance to store the data. This configuration mounts the volume with the MongoDB executables, creates the data directory and starts mongod.
#cloud-config
mounts:
- [ /dev/sdf, /mongodb, "auto", "defaults", "0", "2" ]
runcmd:
- cd /home/ec2-user
- mkdir -p data/db
- cd /mongodb/mongodb-linux-x86_64-1.8.1
- bin/mongod --dbpath /home/ec2-user/data/db --logpath /home/ec2-user/mongodb.log --nohttpinterface --fork
Start an EC2 micro instance using the snapshot and CloudInit configuration file. This is to let us test that the automatic mounting works and the installed mongod will run successfully.
#
ec2-run-instances ami-74f0061d -g MongoDB -k KeyPair20110224 -t t1.micro --availability-zone us-east-1a -b "/dev/sdf=snap-a9c5e0c6::true" --user-data-file MongoDBInit.txt
Wait till the new instance is running.
#
ec2-describe-instances
Check that it’s possible to connect to mongod on the new instance using the mongo shell from your local machine. Address of the machine comes from ec2-describe-instances.
#
mongodb-osx-x86_64-1.8.1/bin/mongo ec2-67-202-13-208.compute-1.amazonaws.com
Terminate the instance.
#
ec2-terminate-instances i-8fc02be1
Modify CloudInit configuration to support replica sets
To create a replica set we need to know the addresses of the machines in the set, so we’ll have to start the instances first and then use ssh to start mongod. The commands run by ssh will run as ec2-user, so the directories that MongoDBInit.txt creates will have to have their owner changed. Modify MongoDBInit.txt by removing the last two lines, so that mongod isn’t automatically started, and adding “chown -R ec2-user data” as the last line.
Start another instance with the new MongoDBInit.txt.
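After that change the file would look something like the sketch below, written as a shell here-document so it can be pasted into a terminal. The contents are the original file with the two mongod lines removed and the chown line added. The one-space indentation of the list items is an assumption, and the quotes are plain straight quotes, as YAML expects.

```shell
# Write the modified CloudInit file into the current directory.
# It mounts the executables volume and prepares the data directory,
# but no longer starts mongod.
cat > MongoDBInit.txt <<'EOF'
#cloud-config
mounts:
 - [ /dev/sdf, /mongodb, "auto", "defaults", "0", "2" ]
runcmd:
 - cd /home/ec2-user
 - mkdir -p data/db
 - chown -R ec2-user data
EOF
```

The chown is needed because the commands in the file run as root at boot, while mongod will later be started over ssh as ec2-user.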
#
ec2-run-instances ami-74f0061d -g MongoDB -k KeyPair20110224 -t t1.micro --availability-zone us-east-1a -b "/dev/sdf=snap-a9c5e0c6::true" --user-data-file MongoDBInit.txt
Use SSH to start mongod.
#
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-204-236-247-165.compute-1.amazonaws.com "/mongodb/mongodb-linux-x86_64-1.8.1/bin/mongod --dbpath /home/ec2-user/data/db --logpath /home/ec2-user/mongodb.log --nohttpinterface --fork"
Test that mongod is running by reaching it from the admin machine.
#
mongodb-osx-x86_64-1.8.1/bin/mongo ec2-204-236-247-165.compute-1.amazonaws.com
Shut down mongod using ssh and the process id reported when it was started with ssh.
#
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-204-236-247-165.compute-1.amazonaws.com "kill 1049"
Terminate the instance.
#
ec2-terminate-instances i-8fc02be1
Starting a replica set
Start two instances to host the two servers of the replica set.
#
ec2-run-instances -n 2 ami-74f0061d -g MongoDB -k KeyPair20110224 -t t1.micro --availability-zone us-east-1a -b "/dev/sdf=snap-a9c5e0c6::true" --user-data-file MongoDBInit.txt
Start mongod on both machines. Use the internal host names to communicate between the instances: the first command tells machine one to talk to machine two, and the second tells two to talk to one. The replica set is called logSet.
#
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-50-17-110-242.compute-1.amazonaws.com "/mongodb/mongodb-linux-x86_64-1.8.1/bin/mongod --dbpath /home/ec2-user/data/db --logpath /home/ec2-user/mongodb.log --nohttpinterface --fork --replSet logSet/domU-12-31-39-04-0C-AF.compute-1.internal"
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-50-19-70-14.compute-1.amazonaws.com "/mongodb/mongodb-linux-x86_64-1.8.1/bin/mongod --dbpath /home/ec2-user/data/db --logpath /home/ec2-user/mongodb.log --nohttpinterface --fork --replSet logSet/domU-12-31-39-09-88-8B.compute-1.internal"
Use mongo to configure the replica set. Until this is done the replica set isn’t created. Connect to one of the two instances. The internal host names of the two instances are used in the configuration. It takes a short while before the set is configured and the response appears.
#
mongodb-osx-x86_64-1.8.1/bin/mongo ec2-50-19-70-14.compute-1.amazonaws.com/admin
MongoDB shell version: 1.8.1
connecting to: ec2-50-19-70-14.compute-1.amazonaws.com/admin
> db.runCommand({"replSetInitiate" : {
... "_id" : "logSet",
... "members" : [
... {
... "_id" : 1,
... "host" : "domU-12-31-39-04-0C-AF.compute-1.internal"
... },
... {
... "_id" : 2,
... "host" : "domU-12-31-39-09-88-8B.compute-1.internal"
... }
... ]}})
{
"info" : "Config now saved locally. Should come online in about a minute.",
"ok" : 1
}
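Typing the configuration document by hand is error-prone, so it could be generated from the set name and the host names. The helper below is a sketch. Its name is hypothetical, and it simply numbers the members from 1 in the order given, as in the session above.

```shell
# repl_initiate_js SET_NAME HOST...
# Prints a replSetInitiate command for the mongo shell.
# Members get _id 1, 2, ... in the order the hosts are listed.
repl_initiate_js() {
  set_name=$1
  shift
  members=""
  id=1
  for host in "$@"; do
    if [ -n "$members" ]; then
      members="$members, "
    fi
    members="$members{ \"_id\" : $id, \"host\" : \"$host\" }"
    id=$((id + 1))
  done
  printf 'db.runCommand({ "replSetInitiate" : { "_id" : "%s", "members" : [ %s ] } })\n' \
    "$set_name" "$members"
}

# Example: print the command for the two instances used above.
repl_initiate_js logSet \
  domU-12-31-39-04-0C-AF.compute-1.internal \
  domU-12-31-39-09-88-8B.compute-1.internal
```

The printed line can be pasted into a mongo shell connected to the admin database of one of the members.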
Look at the mongodb.log files on the two machines. You can see which has become the primary and which the secondary. Also, when you connect to an instance you can see if it is the primary or not.
#
mongodb-osx-x86_64-1.8.1/bin/mongo ec2-50-19-70-14.compute-1.amazonaws.com/admin
MongoDB shell version: 1.8.1
connecting to: ec2-50-19-70-14.compute-1.amazonaws.com/admin
logSet:PRIMARY>
Now shut down the slave and then the master using the process ids we recorded when we started them.
#
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-50-17-110-242.compute-1.amazonaws.com "kill 1044"
ssh -i keys/KeyPair20110224.pem ec2-user@ec2-50-19-70-14.compute-1.amazonaws.com "kill 1065"
Terminate the instances.
#
ec2-terminate-instances i-bb7780d5 i-b97780d7
Clearly, executing all of these commands manually isn’t going to work for more than this initial setup; it’s just too complex. The next step is to try to automate the process of bringing up and shutting down a replica set.