Aug 3, 2013

Node.js + MongoDB, part 2: here comes memcached!

Let's elaborate on the Node.js + MongoDB app. I lied: it's not really worth $1 billion... yet. It surely will once we've added memcached to the mix ;)

Once again, we'll use a couple of EC2 instances running Ubuntu 12.04: one for the Node.js web server and one for the memcached server. MongoDB will still be served from MongoLab.

Start your instances and let's configure our memcached server first. Since it requires port 11211 to be open, we have to add a couple of rules for 11211/UDP and 11211/TCP in the security group attached to the instance.

Then, let's install memcached:
ubuntu@ip-10-234-177-74:~$ sudo apt-get install memcached

We also need to edit /etc/memcached.conf in order to set the '-l' parameter to the correct IP address (running ifconfig eth0 will confirm the right one to use). In my case, it is

Then, we need to restart memcached:

ubuntu@ip-10-234-177-74:~$ sudo service memcached restart

Now, let's go to the Node.js instance and check that we can access the memcached server:
ubuntu@ip-10-48-161-115:~$ echo stats|nc 11211

If you see a lot of stats like I do, you're good to go. If not, double-check the steps above (rules, config file, restart).

Now, let's install the memcached client for Node.js with npm, the Node.js package manager. There are several clients out there; mc looks pretty good and well-maintained :)

ubuntu@ip-10-48-161-115:~$ npm install mc

That's it. Now, let's write some code, yeah! Here's the idea:
  • call the web server with a MongoDB ObjectId, e.g.
  • query the memcached server
  • if we hit, job done!
  • if we miss, query the MongoDB server and update the cache
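To make the flow concrete, here's a minimal sketch of the hit/miss logic. It uses in-memory stand-ins rather than the real mc and MongoDB clients, so everything named here (lookup, makeFakeCache, makeFakeDb, findById) is made up for illustration, and the stand-ins call back synchronously whereas real clients call back asynchronously.

```javascript
// Sketch of the lookup flow: cache first, database on a miss.
// `cache` and `db` are hypothetical in-memory stand-ins, not the real
// memcached or MongoDB clients.

const TTL_SECONDS = 60; // items expire after 60 seconds, as in this post

function makeFakeCache() {
  const store = new Map();
  return {
    get(key, cb) { cb(null, store.has(key) ? store.get(key) : undefined); },
    // The stand-in ignores the TTL; a real memcached server would enforce it.
    set(key, value, ttl, cb) { store.set(key, value); cb(null); }
  };
}

function makeFakeDb(docs) {
  return {
    findById(id, cb) { cb(null, docs[id] || null); }
  };
}

// Look up `id`: return the cached value on a hit; otherwise query the
// database, store the result in the cache, and return it.
function lookup(cache, db, id, cb) {
  cache.get(id, (err, cached) => {
    if (err) return cb(err);
    if (cached !== undefined) {
      return cb(null, { source: 'cache', value: cached });
    }
    db.findById(id, (err, doc) => {
      if (err) return cb(err);
      if (!doc) return cb(null, { source: 'none', value: null });
      cache.set(id, doc.x, TTL_SECONDS, (err) => {
        if (err) return cb(err);
        cb(null, { source: 'db', value: doc.x });
      });
    });
  });
}

const id = '51e3ce08915082db3df32bfc';
const cache = makeFakeCache();
const db = makeFakeDb({ [id]: { _id: id, x: 13 } });

lookup(cache, db, id, (err, r) => console.log(r.source, r.value)); // db 13
lookup(cache, db, id, (err, r) => console.log(r.source, r.value)); // cache 13
```

The first call misses and goes to the database; the second one is served from the cache.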

Alright, let's run this and hit it with some requests:
Mac:~ julien$ curl

Mac:~ julien$ curl

Console output:
Request received, id=51e3ce08915082db3df32bfc
Cache miss, key 51e3ce08915082db3df32bfc. Querying...
Item found: {"_id":"51e3ce08915082db3df32bfc","x":13}
Stored key=51e3ce08915082db3df32bfc, value=13

Memcached stats:
ubuntu@ip-10-234-177-74:~$ echo stats|nc 11211|grep [s,g]et
STAT cmd_get 1
STAT cmd_set 1
STAT get_hits 0
STAT get_misses 1

Let's try the same request again (within 60 seconds!): +1 get, +1 hit!
Request received, id=51e3ce08915082db3df32bfc
Cache hit,  key=51e3ce08915082db3df32bfc, value=13

STAT cmd_get 2
STAT cmd_set 1
STAT get_hits 1
STAT get_misses 1

And 60 seconds later, the memcached item should have disappeared: +1 get, +1 miss, +1 set
Cache miss, key 51e3ce08915082db3df32bfc. Querying...
Item found: {"_id":"51e3ce08915082db3df32bfc","x":13}
Stored key=51e3ce08915082db3df32bfc, value=13

STAT cmd_get 3
STAT cmd_set 2
STAT get_hits 1
STAT get_misses 2
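If you want to convince yourself of how those counters move, here's a small simulation of the three requests above. It's only an illustration under my own assumptions: makeTtlCache is a made-up in-memory cache with a fake clock, not how memcached actually works internally.

```javascript
// Tiny in-memory cache with expiry and memcached-style counters.
// Illustration only: this is not how memcached is implemented.
function makeTtlCache(now) {
  const store = new Map(); // key -> { value, expiresAt }
  const stats = { cmd_get: 0, cmd_set: 0, get_hits: 0, get_misses: 0 };
  return {
    stats,
    get(key) {
      stats.cmd_get++;
      const entry = store.get(key);
      if (entry && entry.expiresAt > now()) {
        stats.get_hits++;
        return entry.value;
      }
      store.delete(key); // drop the expired (or absent) entry
      stats.get_misses++;
      return undefined;
    },
    set(key, value, ttlSeconds) {
      stats.cmd_set++;
      store.set(key, { value, expiresAt: now() + ttlSeconds * 1000 });
    }
  };
}

// Fake clock (in milliseconds) so we don't have to wait 60 real seconds.
let clock = 0;
const ttlCache = makeTtlCache(() => clock);
const key = '51e3ce08915082db3df32bfc';

// Request 1: miss, then store the value fetched from the database.
if (ttlCache.get(key) === undefined) ttlCache.set(key, 13, 60);

// Request 2, ten seconds later: hit.
clock += 10 * 1000;
ttlCache.get(key);

// Request 3, after the item has expired: miss, then store again.
clock += 60 * 1000;
if (ttlCache.get(key) === undefined) ttlCache.set(key, 13, 60);

console.log(ttlCache.stats);
// → { cmd_get: 3, cmd_set: 2, get_hits: 1, get_misses: 2 }
```

The final counters match the last stats output: 3 gets, 2 sets, 1 hit, 2 misses.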

Pretty cool, huh? A basic Node.js + memcached + MongoDB app in fewer than 100 lines of code, comments and logging included.

However, the really great stuff is what you DON'T see:
  • Node.js handles concurrent requests asynchronously, through its event loop and callbacks.
  • Building a memcached cluster will hardly have any impact on the application code.
  • Building a MongoDB cluster (replica sets, sharding) will be completely transparent.
  • And of course, you can easily create your own pool of Node.js servers and load balance them.
That's A LOT of scalability for free as far as the application developer is concerned.

Food for thought... Something tells me this isn't the last post on these topics. I hope you're enjoying this as much as I am. Time for a drink, I'm exhausted. Cheers!
