Oct 1, 2013

Max Schireson (MongoDB) @ Criteo !

Max & Julien

Criteo had a very special guest today! 

Max Schireson, CEO of MongoDB, was kind enough to make a special detour on his European trip just to meet us. How cool is that? Finally a US CEO who realizes that there are intelligent lifeforms on the other side of the Atlantic :D 

Obviously, we had a long discussion about MongoDB and then we just HAD to show Max our now famous rooftop and brag about our world-class view. Grey skies, but hey, we still have the best monuments ;)

Thanks again, Max. We've been very happy to welcome you and we're looking forward to many more fruitful discussions about Criteo and MongoDB!

 



Sep 25, 2013

Viking laws for tech teams, part 7: attack one target at a time

This post is part of the "Viking laws for tech teams" series.


Lack of focus is the single biggest issue a team can face. It usually ranges from "we don't really know what's expected of us" to "we have to work on 10 different things at a time", although the most frequent occurrence is probably "management changed our priorities... again".


This is especially bad when resources are scarce and ain't they always? And how can you keep your team motivated if all they're allowed to do is run in circles, chase one problem after the next and never truly deliver anything but half-baked features or quick and dirty fixes? It's all downhill from there, right? Less productivity, even less achievements, people starting to leave, etc.

"How can my team be so busy and not deliver anything valuable?". Well, that's a fair question and it requires a serious answer, so the first thing to do is to assess the situation. Chopping away at priorities will not fix anything until you understand why the team is underwater. 

In a previous company, I asked every team (20 teams or so, about 130 people total) to describe their workload using three simple numbers:
- "Build": % of time spent designing and building new features
- "Maintain": % of time spent maintaining their applications (bug fixing, production support, etc)
- "Waste": % of time spent on something totally unproductive (meetings, endless arguments over specs or the lack thereof, etc)

A large majority of teams came back with 30/40/30. Not a single one scored over 50% on "Build". Several ones scored 50% on "Waste"... Scary.

Another benefit of this simple assessment is that it may help teams realize how bogged down they are. Some people are convinced that they ARE doing the right thing, especially in startups: "Come on! How could we be so busy and not deliver anything valuable?". Well no, you aren't, you're just chasing your own tail and you've been doing it for so long that you can't realize it anymore.

As a manager, reducing waste is the #1 target you need to attack. Do nothing else until all your teams can spend at least 50% building stuff. Once you reach this milestone, you will have improved morale and productivity. And you will have enough oxygen to start looking at what the real priorities are.

In order to reduce waste, you have to understand exactly what it's made of. Dumpster diving is the only way. Sit down with the teams and let them explain. Everyone should speak, not only the project manager / team leader (sometimes, HE is the problem). My experience is that you'll end up hearing the same things in pretty much every team: too many meetings, too much time wasted on coding/re-coding/re-re-coding the same feature because requirements are fuzzy, poor code quality, no documentation,  lack of proper tools, intrusive management or key stakeholders (wearing the "business emergency" disguise), too many or too few decisions, etc.

Some problems won't be easily fixed (code quality, etc). Cutting down on meetings can and must be done immediately (yes, I hate meetings). Same thing for decisions: you owe your team clarity and stability. Some calls may be above your pay grade, but there's a lot you can decide on your own, so do it, explain why and don't change your mind every other day: don't worry, Marketing will do it for ya ;)

Anything process-related can be improved in a matter of weeks. It's not rocket science: identify key business users, talk to them, explain Agile, start to practice it with them. Make sure that all technical requests are funneled to a limited number of well-identified people in your team, not to random individual developers. You should take the time to explain why this is a better way, but once you've done it, you also need to make sure that people follow the rule and yes, you need to raise your voice every single time someone doesn't.

Do this right and your "Waste" ratio will go down 10-20% very quickly. For a 5-people time working 5 days a week, that's roughly 3-4 additional man-days every week, and hopefully a slightly more pleasant and productive work environment for everyone.

Now that your teams can actually spend more time on building stuff, let's look at priorities. Chances are there's a long list of requests that Marketing-has-been-asking-for-two-years-but-R&D-hasn't-delivered, which is probably very similar to the list of Stuff-that-R&D-will-never-do-because-it's-all-stupid-shit-and-we-have-no-time-anyway. Fair enough. However, there's a big difference between a to-do list and a project portfolio and an even bigger difference between a project portfolio and a product vision.

Setting priorities starts with a product vision, factoring in functional requirements (aka "features") as well as non-functional requirements (aka "architecture, scalability, cost, etc"). Funny how the former is discussed ad nauseam (yeah, my parents made me study Latin for years) and the latter is usually swept under the proverbial rug.

Ah, I hear you saying that this vision thing is all beyond your control and that I'm not helping you at all here. Well, IS it really beyond your control? Have you discussed it with your fellow engineers/team leads? Have you tried raising your ideas to whoever is in charge? It's worth a shot. On top of that, what about your own scope? What is YOUR plan for the next year or so? How will you improve your APIs, your infrastructure, your agility? Don't expect these questions to be answered by anyone else but your own team.

As a more senior manager, you must build a vision and make sure your teams are aligned. A priority list without a vision is meaningless. Within this vision, making sure that non-functional requirements are properly addressed is of paramount importance. YOU build and run the platform, YOU know what needs to be done under the hood.  And don't even bother about other parts of the company not being aligned, you can't do much about it anyway. Take care of your own and hope for the best. 

Another important point: building the vision is crucial, but so is leaving the actual action plan to the team. Do not stand in their way, do not micromanage! Your role is to assign teams their own kill zone and let them roam the land. Trust me, your teams know best. If not, you have a hiring problem and you need to fix it ASAP.

So at this point, you've started to reduce waste and to build your vision. You can now shed a new light on The-Long-List. How valuable is each item? Is it a step towards your vision or not? If not, is there any real business value in building it? In building it right now or 2 months from now? Time to be very critical about this and challenge the business owners! 

As you know, Scrum requires User Stories to be ranked by business value. Defining business value may be easy for customer-oriented features. "Once this customer feed is connected to our platform, we expect to make x thousands of dollars every month, so let's launch this project ASAP": ok, that probably makes sense and it's quite easy to benchmark anyway. Most of the time, it's not that obvious: what business value do these new back office pages bring? Increased productivity? More technical debt? You need to find out.

Architecture work is yet another story. You can't stick to "if we don't do this, we're dead in 3 months", especially if you need to prioritize against proper business features. Getting on top on the backlog will require some extra work to explain why this is really critical and why, yes Mr CxO, you may have to trade short term revenue for long-term scalability.

Building the list of priorities is a never-ending process (aka "backlog grooming"). Don't spend 6 months planning: just make sure you have enough work for the next 3-4 weeks and get going! The future is unknown and trying to figure out what needs to be done 6 months from now is totally useless. 

And remember: for any given team, there can only be one priority #1. If you have 12 priorities for a sprint/quarter/release, then you have none. The only important question is: "what is the highest-impact feature this team can deliver in the next 2 weeks?".

The viking raid, hit-and-run strategy will show real results in little time. It will also build confidence and momentum: the best sign of success is people not talking about The-Long-List anymore and focusing on The-Next-Step instead.

Attack one target at a time with full force, CRUSH IT, move to the next one quickly, repeat. There is no other way.

That's it for today. Till next time, kick ass, my friends.

Sep 20, 2013

MongoDB Motorcycle Club, Paris chapter

Great MongoDB event in Paris yesterday! Thank you to MongoDB and to all attendees for giving me the opportunity to present the work of our teams. Lots of great technical questions, you guys made me burn those breakfast calories ;) Nice to see that the MongoDB techie community is getting stronger and stronger.

See you at the MongoDB User Group hosted by Criteo on September 26th!


Aug 21, 2013

HOWTO: compiling ffmpeg + x264 + MP3 + AAC + Xvid + AMR on Ubuntu 12.04

By popular demand (according to page view stats), I've decided to refresh the Dec'2008 post ("HOWTO: compiling ffmpeg + x264 + MP3 + Xvid + AMR on Ubuntu 8.x") for Ubuntu 12.04. Enjoy :)

ffmpeg is THE audio/video conversion tool. Unfortunately, the default build included in Ubuntu is usually quite outdated, as well as lacking support for many codecs.

The purpose of this article is to show you how you can build a fresh, up to date version of ffmpeg supporting (almost) all major codecs. This procedure was successfully performed on a vanilla Ubuntu 12.04 system.


0) Prerequisites

First, we need to enable the 'multiverse" repository in /etc/apt/sources.list, because that's where some of our external libraries live. All you have to do is
uncomment the line similar to:
add deb http://eu-west-1.ec2.archive.ubuntu.com/ubuntu/ precise multiverse sudo apt-get update

Don't forget to refresh the repos:
$ sudo apt-get update

Then, we need to make sure that git and svn are installed. We'll need them to fetch source code.
$ sudo apt-get install git subversion

Let's also add some mandatory build tools:
$ sudo apt-get install build-essential nasm pkg-config 

Now would be a good time to decide where you're going to build all those sources. Just create a temporary directory anywhere you like (you'll need about 150MB).

The x264 build requires a recent version of the yasm assembler. Let's take care of it:
$ sudo apt-get remove yasm 
$ wget http://www.tortall.net/projects/yasm/releases/yasm-1.2.0.tar.gz 
$ tar xvfz yasm-1.2.0.tar.gz 
$ cd yasm-1.2.0 
$ ./configure --prefix=/usr/local 
$ make 
$ sudo make install 
$ yasm --version
yasm 1.2.0 Compiled on Aug 21 2013. 

Done! One final note before we start building: if you have an existing installation of ffmpeg, you may run into linking issues caused by conflicting library versions. My advice is to remove all existing copies of libav* (libavcodec and so on) which may be present in /usr/lib, either by uninstalling them with APT or by deleting the .a and .so files.


1) Fetching the ffmpeg sources

First of all, let's get the latest ffmpeg source:
$ git clone git://source.ffmpeg.org/ffmpeg.git ffmpeg

Of course, you could just go ahead with configuremakemake install and be done with it. Unfortunately, it's not that simple. Go ahead, run configure:

$ cd ffmpeg
$./configure 
--prefix=/usr/local
output removed
Creating config.mak and config.h...

Take a closer look at the output, especially at the 'Enabled encoders' section. A number of major formats, like AAC, MP3, x.264 or XViD are missing. Can you live without these? Probably not...

Why, oh why are they missing? Take another look at the output of the configure command:
libfaac enabled no
libmp3lame enabled no
libx264 enabled no
libxvid enabled no

These encoders are missing because they're handled by external libraries which are not part of the ffmpeg source package. Chasing them all is a bit of a pain in the #$$ and hopefully this article will help!

2) Configuring ffmpeg... and failing

Let's go wild and enable these additional encoders: AAC, MP3, Theora, Vorbis, x264, xvid, AMR, GSM. Let's also build shared libraries, which are useful if you're running multiple copies of ffmpeg:

./configure --prefix=/usr/local --enable-gpl --enable-version3 --enable-nonfree --enable-shared --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libfaac --enable-libgsm --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libxvid 

It will inevitably lead to something like:
ERROR: libfaac not found

That's ok, we haven't installed the external libraries required by our ffmpeg build. Let's get to it!

3) Installing AAC, MP3, Theora, Vorbis, AMR, GSM

These are readily available in the Ubuntu repository, let's add them:
$ sudo apt-get install libfaac-dev libmp3lame-dev libtheora-dev libvorbis-dev libopencore-amrnb-dev libopencore-amrwb-dev libgsm1-dev zlib1g-dev libgpac1-dev

4) Installing libx264

x264 is a free library for encoding H264/AVC video streams. It can be installed with APT using 'apt-get install libx264-dev' but let's make sure we have both the latest ffmpeg and the latest x264.

So, let's fetch the x264 sources and build them:
git clone git://git.videolan.org/x264.git
$ cd x264

$ ./configure --prefix=/usr/local --enable-shared
make
$ sudo make install


5) Installing libxvid

Let's fetch the xvid sources and build them:
$ wget http://downloads.xvid.org/downloads/xvidcore-1.3.2.tar.gz 
$ tar xvfz xvidcore-1.3.2.tar.gz 
$ cd xvidcore-1.3.2/build/generic 
$ ./configure --prefix=/usr/local 
$ make
$ sudo make install

6) Configuring ffmpeg... and succeeding!

We should have everything we need now. Let's try that configure command again:
./configure --prefix=/usr/local --enable-gpl --enable-version3 --enable-nonfree --enable-shared --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libfaac --enable-libgsm --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libxvid
output removed
Creating config.mak, config.h, and doc/config.texi...

All right, let's build it. Time for coffee(s), this is a rather long build:
$ make
$ sudo make install
$ sudo ldconfig -v

Cool, huh? Now, let's check this new ffmpeg:
which ffmpeg
/usr/local/bin/ffmpeg

$ ffmpeg -encoders|grep -E "mp3|xvid|aac|gsm|amr|x264|theora|vorbis"
ffmpeg version N-55691-gca7f637 Copyright (c) 2000-2013 the FFmpeg developers
  built on Aug 21 2013 15:44:17 with gcc 4.6 (Ubuntu/Linaro 4.6.3-1ubuntu5)
  configuration: --prefix=/usr/local --enable-gpl --enable-version3 --enable-nonfree --enable-shared --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libfaac --enable-libgsm --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libxvid
  libavutil      52. 42.100 / 52. 42.100
  libavcodec     55. 29.100 / 55. 29.100
  libavformat    55. 14.100 / 55. 14.100
  libavdevice    55.  3.100 / 55.  3.100
  libavfilter     3. 82.100 /  3. 82.100
  libswscale      2.  5.100 /  2.  5.100
  libswresample   0. 17.103 /  0. 17.103
  libpostproc    52.  3.100 / 52.  3.100
 V..... libx264              libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (codec h264)
 V..... libx264rgb           libx264 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 RGB (codec h264)
 V..... libxvid              libxvidcore MPEG-4 part 2 (codec mpeg4)
 V..... libtheora            libtheora Theora (codec theora)
 A..X.. aac                  AAC (Advanced Audio Coding)
 A..... libfaac              libfaac AAC (Advanced Audio Coding) (codec aac)
 A..... libopencore_amrnb    OpenCORE AMR-NB (Adaptive Multi-Rate Narrow-Band) (codec amr_nb)
 A..... libgsm               libgsm GSM (codec gsm)
 A..... libgsm_ms            libgsm GSM Microsoft variant (codec gsm_ms)
 A..... libmp3lame           libmp3lame MP3 (MPEG audio layer 3) (codec mp3)
 A..X.. vorbis               Vorbis
 A..... libvorbis            libvorbis (codec vorbis)

Congratulations, your brand new ffmpeg can now encode AAC, MP3, Theora, Vorbis, x264, xvid, AMR and GSM :)

7) Cleaning up (optional)
If like me you keep building the latest version, you will eventually end up with a lot of unnecessary libraries in /usr/local/lib. There's nothing really wrong with old versions, but for the sake of clarity, you may want to remove the old shared libraries, i.e. the ones NOT linked as lib*.so.

That's it for today :)

Arduino: LCD thermometer

I am truly and utterly in love with the Arduino project. There is no simpler and easier way to learn both programming and electronics. Believe me, the smartest gift for a 12-year old kid today is a C beginner book ("Beginning C with Arduino" is a perfect fit: read and approved!), an Arduino starter kit and some guidance to get going. If this had been around 30 years ago, a lot of innocent components would have been spared an untimely death at my n00b hands :D

I usually practice what I preach, so my own kids (10 and 12) are on their way. I can't believe how fast they learn, you should see reading schematics, wiring at light speed and correcting my fat finger mistakes: "Dad, you got this one wrong, it must go to the ground". Awesome.

Anyway, here's a cool little project I've built once I could actually lay my hands on the board again. This is a mix between two projects included in the starter kit, i.e. the Love-o-meter (temperature sensor) and the Crystal Ball (LCD screen). Fritzing schematic and code are on GitHub.

The TMP36 temperature sensor is connected to analog input A0. Its value is read every second by the program. The LCD screen is wired like in the Crystal Ball project, with a potentiometer to adjust brightness.

To avoid displaying "jumpy" temperatures, I'm storing and averaging the last 256 samples. This is implemented using a simple array-based circular buffer. Once the average temperature has been computed, it's displayed on the LCD screen.

A couple more things:

  • I'm defining a custom character (°), using a 5x7 bitmap.
  • I'm using dtostrf() to convert a float into a string. Normally, sprintf() would be used, but the Arduino library doesn't provide float support in sprintf().

Here's the full code.



Till next time... Hack at the world!

Aug 17, 2013

Node.js, part 5.1: don't you C?

Leslie Lamport once said (before many of you were even born): "You know you have a distributed system when the crash of a computer you've never heard of stops you from getting any work done". The Internet sure proved him right, as demonstrated again last night by Google and their global worldwide crash...

However, when distributed systems work as intended, interesting things do happen. For instance, if one was so inclined, one could reimplement a server in a different language while preserving interfaces and things-would-still-work.

And so, as I sipped my morning coffee, I became very much fucking inclined indeed to rewrite yesterday's Node.js zeromq server in C. Call it my stupid idea of the day... or just an interesting comparison of two pieces of code standing 40 years apart :) In the real world, one good reason to do this would be to improve performance of critical parts of your application: divide, rewrite, conquer!

Here's the ANSI89-compliant code (also on Github), the only deviance being the use of snprintf() which only made it into C99 :)



Easy to build (no Makefile, no Maven and its XML crap):
ubuntu@ip-10-35-165-93:~$ gcc -ansi -pedantic -Wextra -Wall -o worker worker.c -lzmq
worker.c: In function ‘main’:
worker.c:29:2: warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘size_t’ [-Wformat]
worker.c:34:2: warning: implicit declaration of function ‘snprintf’ [-Wimplicit-function-declaration]

Let's start this C worker and hit it with some requests coming from our Node.js app:
ubuntu@ip-10-35-165-93:~$ ./worker 
Worker ready on tcp://*:3000
received some stuff 15, size=2
sent result 30
received some stuff 8, size=1
sent result 16

Nice. In all fairness, yesterday's Node.js code is quite easier to read. I did add more error management clutter in the C program, but doing without it is not an option. Love them or hate them, modern languages will spit a stack trace out when they crash: most of the time, this is enough to start investigating. On the other hand,  C will say : "core dumped" and leave you crying all the way home: so yes, you HAVE to write more C code to get roughly the same level of information. You can also see the immense power of toString() and dynamic typing, although the latter does introduces huge caveats.

So, what did we learn? Breaking down your app not only allows you to scale it piece by piece: it also gives you the freedom to rewrite critical parts, whatever your motivation is. To me, this is true modularity, something that I've rarely seen in so-called modular architectures based on Java or C#. Having classes is nice, but services are what you really need.

And yeah, for ultimate control and performance, C cannot be defeated. I'm quite scared to see it disappear from many curriculums. Kids, listen to the old man: you must learn C. You'll thank me later.

End of sermon. Now go in peace and hack at the world!

Aug 16, 2013

Node.js, part 5: Zero the hero!

And you thought I was on holiday?

Time to hack our Node app again and introduce another great piece of software called zeromq. On top of having one of the coolest logos, zeromq is a high-performance, low-latency messaging system that you simply cannot afford to ignore when you need to build highly scalable apps (IMHO). Other contenders would be RabbitMQ (awesome and rock-solid, but not really lightweight) and NSQ (written by the bit.ly folks).

The basic idea will be to have our web app call a remote server through zeromq. The remote server will perform a super fancy calculation and return it back to the web app. I can feel that you're excited already, so let's get started :*)

Both our zeromq machines (the client and the server) require a little bit of setup: on top of Node.js and npm (as described in previous posts), we need to install the zeromq library. Once again, the one available in the Ubuntu repo is way too old, so let's build a fresh version:

$ wget http://download.zeromq.org/zeromq-3.2.3.tar.gz
$ tar xvfz zeromq-3.2.3.tar.gz
$ cd zeromq-3.2.3
$ ./configure
make
$ sudo make install

Now, let's add the zeromq bindings for Node.js, namely the zeromq.node library aka zmq:

$ sudo apt-get install build-essential (required by the zmq install)
$ npm install zmq

Now we're ready to roll. Let's keep it simple and use a nice request-reply pattern: the web app will send a request to the remote server and get a response back. Zeromq supports many other patters, you'll find more info in the official documentation. Definitely worth a read, zeromq is a superb piece of software engineering art. Lightweight, simple, elegant... ah, love at first sight.

But I digress (that fine bottle of Leffe, probably). Let's look at our zeromq server first.

BTW, since I guess it's time to stop mourning SCCS ;), I've created a proper Github account where you'll find all pieces of code posted on this blog. Can't say I love Git dearly, but a little progress every now and then is okay, I guess.



Nice and short, huh? Create a 'reply' socket, listen on port 3000/TCP and send back twice was we got. And yes, everything in life is an 'int', so there.

Now, let's look at the corresponding part in the client, i.e. our web app:



Just as simple. Create a 'request' socket, bind it to our remote server and, in a separate function, fire some data, read the answer and display it.

Here's the full code:



Logs (or it didn't happen, right?): display all entries and get one id twice

local0.info<134>: Aug 16 17:29:52 10.37.166.19 {"message":"*** NODE APP STARTED ***"}
local0.info<134>: Aug 16 17:29:52 10.37.166.19 {"message":"Connected to zeromq server"}
local0.info<134>: Aug 16 17:29:52 10.37.166.19 {"message":"Web server started"}
local0.info<134>: Aug 16 17:29:52 10.37.166.19 {"message":"Connected to memcache"}
local0.info<134>: Aug 16 17:29:52 10.37.166.19 {"message":"Connected to database"}
local0.info<134>: Aug 16 17:30:04 10.37.166.19 {"message":"Request received, id=0"}
local0.info<134>: Aug 16 17:30:04 10.37.166.19 {"message":"Finding all documents"}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Request received, id=51e3ce08915082db3df32c08"}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Cache miss, key 51e3ce08915082db3df32c08. Querying..."}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Item found: {\"_id\":\"51e3ce08915082db3df32c08\",\"x\":25}"}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Zeromq request: 25"}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Zeromq response: 50"}
local0.info<134>: Aug 16 17:30:12 10.37.166.19 {"message":"Stored key=51e3ce08915082db3df32c08, value=25"}
local0.info<134>: Aug 16 17:30:15 10.37.166.19 {"message":"Request received, id=51e3ce08915082db3df32c08"}
local0.info<134>: Aug 16 17:30:15 10.37.166.19 {"message":"Cache hit,  key=51e3ce08915082db3df32c08, value=25"}
local0.info<134>: Aug 16 17:30:15 10.37.166.19 {"message":"Zeromq request: 25"}
local0.info<134>: Aug 16 17:30:15 10.37.166.19 {"message":"Zeromq response: 50"}

Let's take a few steps back and look at what we built over the last posts:


  • a simple Node.js web app...
  • fetching data from an Elasticache cluster...
  • fed by a MongoDB server...
  • logging remotely to a syslog server...
  • calling a remote server through zeromq...
  • all of it open-source...
  • all of it cloud-based...
  • each part easily scalable, independently from the others.
And above everything, very little code: 159 lines (comments included), with the help of a few well-chosen libraries. 

Makes me want to be a developer again. Food for thought ;)

Till next time, my friends.

PS: In case you were wondering "Zero the hero" is a Sabbath song, although one could argue that the Cannibal Corpse cover is actually quite better.

Aug 5, 2013

Node.js, part 4: the Big Kahuna syslog!

To paraphrase the immortal Samuel Jackson in Pulp Fiction: "Logs! The cornerstone of any nutritious application!". And my Node.js app doesn't have any! As Mr Brando once said "Ah, the horror, the horror!". At the time, he probably didn't worry about log management, but the quote is definitely relevant for that too :D

I'm an old fashioned guy, so the first thing that comes to my mind is : "syslog". We could certainly find fancier ways (yes, I'm looking at you, scala-*), but syslog will do for now.

Once again, I'll use an Ubuntu 12.04 instance in EC2, which happens to have a running rsyslogd :) Let's take a look at the main config file, /etc/rsyslogd.conf.

We'll use UDP to ship our logs. However, because of this unbelievable 2-year old bug, we can't use port 514 (the default port for syslog): we'll use 5140 instead, don't forget to open 5140/UDP in the security group of the EC2 instance!

# provides UDP syslog reception
$ModLoad imudp
$UDPServerRun 5140

Let's now look at the default rules, located in /etc/rsyslog.d/50-default.conf. Let's keep it very simple and make sure we log everything. We'll also add a new template, because the default one misses some useful information, such as priority (duh!).

$template TraditionalFormatWithPRI,"%pri-text%: %timegenerated% %HOSTNAME% %syslogtag%%msg:::drop-last-lf%\n"
*.*    /var/log/messages;TraditionalFormatWithPRI

I hear my sysadmin team booing the ugliness of this setup. Yeah, I love you too, bitchez :-P

Anyway, this should be enough, now we just need to restart rsyslogd:
ubuntu@ip-10-34-245-176:~$ sudo service rsyslog restart
rsyslog stop/waiting
rsyslog start/running, process 4639

Now, let's move back to our Node.js instance and try to talk to our syslog server:
ubuntu@ip-10-37-166-19:~$ echo "<14>Node sez: I see you"|nc -u -v -w0 10.34.245.176 5140

If you tail /var/log/messages on the syslog server, the last line should say:
user.info<14>: Aug  5 14:40:18 Node sez: I see you

OK. Now that we have a running syslog server, how is the Node.js app going to send it data? The good news is that there's a large number of packages to do this. The bad news is, most of them don't seem to work very well... 

Ain2 worked right out of the box. If you're only looking for a syslog module, I would recommend this one. However, I ended up picking winston, an interesting logging framework supporting many different transports  (file, http, several databases, etc) including syslog through the winston-syslog module.

In theory, at least. For the life of me, I couldn't get this thing to work, despite a lot of creative hacking and creative swearing too. Fortunately, winston-syslogUdp works: "It was created when the authors experienced challenges with winston-syslog". Yeah, tell me about it! Good job, guys.

winston requires a newer version of Node.js than the one available in the Ubuntu repo, so let's take care of this upgrade first:
ubuntu@ip-10-37-166-19:~$ sudo add-apt-repository ppa:chris-lea/node.js
ubuntu@ip-10-37-166-19:~$ sudo apt-get update
ubuntu@ip-10-37-166-19:~$ sudo apt-cache show nodejs|grep Version
Version: 0.10.15-1chl1~precise1
Version: 0.6.12~dfsg1-1ubuntu1
ubuntu@ip-10-37-166-19:~$ sudo apt-get install nodejs=0.10.15-1chl1~precise1
ubuntu@ip-10-37-166-19:~$ node -v
v0.10.15

Now, let's update the Node.js packages and install the winston packages:
ubuntu@ip-10-37-166-19:~$ npm update
ubuntu@ip-10-37-166-19:~$ npm install winston winston-syslogudp

After some more hacking and swearing (but no drinking, I promise), I adapted my program to use winston, with the following setup:
  • don't log to the console 
  • log to a local file
  • log to a remote syslog server
Here how you do this, it's really simple:
// Don't log to console
winston.remove(winston.transports.Console);
// Log to local file
winston.add(winston.transports.File, { filename: 'log.txt' });
// Log to remote syslog
winston.add(winston.transports.Syslog, {"host":"10.34.245.176", "port":"5140", "localhost":"10.37.166.19"});

This is quite self-explanatory, as you will see below. However, winston can also do some pretty advanced stuff, I encourage you to read the documentation and go log-crazy ;)

While I was at it, I cleaned the code a little bit and also solved the MongoDB ObjectId validation issue like this. Not sure if it's the canonical way, but it works.
// Check that the ObjectId is valid
try {
    oid = new require('mongodb').ObjectID.createFromHexString(query.id);
}
catch (err) {
    winston.info("Invalid OID, err="+err);
    response.end();
    return;
}

Here's the full code.



Let's run this, hit it with some requests and look at the logs :)

ubuntu@ip-10-37-166-19:~$ node www5.js &
ubuntu@ip-10-37-166-19:~$ tail -f log.txt 
{"level":"info","message":"*** NODE APP STARTED ***","timestamp":"2013-08-05T15:07:36.678Z"}
{"level":"info","message":"Web server started","timestamp":"2013-08-05T15:07:36.823Z"}
{"level":"info","message":"Connected to memcache","timestamp":"2013-08-05T15:07:36.850Z"}
{"level":"info","message":"Connected to database","timestamp":"2013-08-05T15:07:36.876Z"}

ubuntu@ip-10-34-245-176:~$ tail -f /var/log/messages 
local0.info<134>: Aug  5 15:07:34 10.37.166.19 {"message":"*** NODE APP STARTED ***"}
local0.info<134>: Aug  5 15:07:34 10.37.166.19 {"message":"Web server started"}
local0.info<134>: Aug  5 15:07:34 10.37.166.19 {"message":"Connected to memcache"}
local0.info<134>: Aug  5 15:07:34 10.37.166.19 {"message":"Connected to database"}

GET http://ec2-54-247-53-161.eu-west-1.compute.amazonaws.com:8080/?id=0
local0.info<134>: Aug  5 15:11:51 10.37.166.19 {"message":"Request received, id=0"}
local0.info<134>: Aug  5 15:11:51 10.37.166.19 {"message":"Finding all documents"}

GET http://ec2-54-247-53-161.eu-west-1.compute.amazonaws.com:8080/?id=51e3ce08915082db3df32bf7
local0.info<134>: Aug  5 15:12:19 10.37.166.19 {"message":"Request received, id=51e3ce08915082db3df32bf7"}
local0.info<134>: Aug  5 15:12:19 10.37.166.19 {"message":"Cache miss, key 51e3ce08915082db3df32bf7. Querying..."}
local0.info<134>: Aug  5 15:12:19 10.37.166.19 {"message":"Item found: {\"_id\":\"51e3ce08915082db3df32bf7\",\"x\":8}"}
local0.info<134>: Aug  5 15:12:19 10.37.166.19 {"message":"Stored key=51e3ce08915082db3df32bf7, value=8"}

The same one, right away:
local0.info<134>: Aug  5 15:12:21 10.37.166.19 {"message":"Request received, id=51e3ce08915082db3df32bf7"}
local0.info<134>: Aug  5 15:12:21 10.37.166.19 {"message":"Cache hit,  key=51e3ce08915082db3df32bf7, value=8"}

The same one, more than 60 seconds later :)
local0.info<134>: Aug  5 15:14:15 10.37.166.19 {"message":"Request received, id=51e3ce08915082db3df32bf7"}
local0.info<134>: Aug  5 15:14:15 10.37.166.19 {"message":"Cache miss, key 51e3ce08915082db3df32bf7. Querying..."}
local0.info<134>: Aug  5 15:14:15 10.37.166.19 {"message":"Item found: {\"_id\":\"51e3ce08915082db3df32bf7\",\"x\":8}"}
local0.info<134>: Aug  5 15:14:15 10.37.166.19 {"message":"Stored key=51e3ce08915082db3df32bf7, value=8"}

Now, let's try a bogus ObjectID:
GET http://ec2-54-247-53-161.eu-west-1.compute.amazonaws.com:8080/?id=666
local0.info<134>: Aug  5 15:17:22 10.37.166.19 {"message":"Request received, id=666"}
local0.info<134>: Aug  5 15:17:22 10.37.166.19 {"message":"Invalid OID, err=Error: Argument passed in must be a single String of 12 bytes or a string of 24 hex characters"}

Woohoo! This doesn't crash the app anymore. By the way, if you use Chrome, you certainly see a lot of annoying lines saying:
local0.info<134>: Aug  5 15:14:15 10.37.166.19 {"message":"Request received, id=undefined"}
You can ignore them: this is just Chrome trying over and over and over again to get the favorite icon (favicon.ico)... This is a known bug.

And so now, our web app haz logz and at 131 lines of code (comments included), it's not bloated either. Pretty amazing.

That's it for today. Keep hacking :)

Aug 3, 2013

Node.js + MongoDB, part 3: exit memcached, enter ElastiCache

In a previous post, we added caching to our Node.js + MongoDB web application, thanks to an EC2 instance running memcached.

As you may know, AWS support its own brand of caching, aka ElastiCache: "Amazon ElastiCache is protocol-compliant with Memcached, a widely adopted memory object caching system, so code, applications, and popular tools that you use today with existing Memcached environments will work seamlessly with the service".

Seamlessly? Hmmm. Alright, let's take a look :)

First, let's use the AWS console to create an ElastiCache cluster. We'll start small, with 1 single micro node supporting a little over 200MB of cache. More than we need... and for free :)

Wait a couple of minutes and the cluster will come online. Let's check that the single node is accessible from our EC2 instance (no need to mess with the security group, port 11211 is open by default):
ubuntu@ip-10-48-249-35:~$ echo stats| nc memcache.9vficl.0001.euw1.cache.amazonaws.com 11211

This looks like memcached stats alright. Ok, now let's change a single line in our app, to use ElastiCache instead of our homemade instance:

// MemcacheClient = new mc.Client("10.234.177.74");
MemcacheClient = new mc.Client("memcache.9vficl.0001.euw1.cache.amazonaws.com");

Now, let's run the app:
ubuntu@ip-10-48-249-35:~$ node www4.js 
Web server started
Connected to memcache
Connected to database

Nice. Time for queries and console output!
Mac:~ julien$ curl http://ec2-54-216-3-139.eu-west-1.compute.amazonaws.com:8080/?id=51e3ce08915082db3df32bfc

Request received, id=51e3ce08915082db3df32bfc
Cache miss, key 51e3ce08915082db3df32bfc. Querying...
Item found: {"_id":"51e3ce08915082db3df32bfc","x":13}
Stored key=51e3ce08915082db3df32bfc, value=13

Let's replay the request quickly:
Request received, id=51e3ce08915082db3df32bfc
Cache hit,  key=51e3ce08915082db3df32bfc, value=13

And once more after 60 seconds (the item TTL):
Request received, id=51e3ce08915082db3df32bfc
Cache miss, key 51e3ce08915082db3df32bfc. Querying...
Item found: {"_id":"51e3ce08915082db3df32bfc","x":13}
Stored key=51e3ce08915082db3df32bfc, value=13

Let's stop the web app and look at the cache stats:
ubuntu@ip-10-48-249-35:~$ echo stats| nc memcache.9vficl.cfg.euw1.cache.amazonaws.com 11211|grep [g,s]et
STAT cmd_get 3
STAT cmd_set 2
STAT cmd_config_get 134
STAT cmd_config_set 1
STAT get_hits 1
STAT get_misses 2

God knows that 'seamlessly' is one of the most overused words in the IT business, but I've got to admit, this is truly seamless :)

As usual with AWS, you also get a bunch of graphs out of the box. That's a nice plus.


So that's the "cache" part of ElastiCache and it does what it's supposed to do. What about the 'elastic' part'?

Adding more nodes to the cluster is literally one click away: "Add node" :) After a few minutes, the new node is added to cluster.

Let's update our web app and relaunch it:
// MemcacheClient = new mc.Client("10.234.177.74");
MemcacheClient = new mc.Client(["memcache.9vficl.0001.euw1.cache.amazonaws.com", "memcache.9vficl.0002.euw1.cache.amazonaws.com"]);

ElastiCache has its own client, which allows the app to auto-discover new nodes without having to restart. Very nice, but the client is only available for Java (grumble) and PHP (bleh). Hopefully other languages are on the way.

Now, how about we hit the 2-node cluster with a little more traffic? This ugly script will do the trick:



Let's run this in 2 or 3 local shells and wait for a little while. Coffee break :)

Alright, what do the graphs say?






In the beginning, we see some traffic on node 1 only, then some on node2 only, as I tested them independently. Then, I updated the app to support both nodes and I restarted it. As you can see, traffic is pretty well balanced between the two servers, which means that the mc client works :)

One last thing to conclude: going back to the web app console output, we can clearly see the parallel asynchronous nature of Node.js at work.

Stored key=51e3ce08915082db3df32bfd, value=14
Stored key=51e3ce08915082db3df32bff, value=16
Stored key=51e3ce08915082db3df32bfe, value=15
Request received, id=51e3ce08915082db3df32bf0
Request received, id=51e3ce08915082db3df32bf2
Request received, id=51e3ce08915082db3df32bf1
Cache hit,  key=51e3ce08915082db3df32bf0, value=1
Cache hit,  key=51e3ce08915082db3df32bf2, value=3
Cache hit,  key=51e3ce08915082db3df32bf1, value=2
Request received, id=51e3ce08915082db3df32bfb
Cache hit,  key=51e3ce08915082db3df32bfb, value=12
Request received, id=51e3ce08915082db3df32bfd
Request received, id=51e3ce08915082db3df32bfc
Cache hit,  key=51e3ce08915082db3df32bfc, value=13
Request received, id=51e3ce08915082db3df32bf4
Request received, id=51e3ce08915082db3df32bf3
Cache hit,  key=51e3ce08915082db3df32bf3, value=4
Request received, id=51e3ce08915082db3df32bf5
Cache hit,  key=51e3ce08915082db3df32bfd, value=14

The two lines I've highlighted above come from the processing of the same HTTP request. However, a lot of stuff happened in the middle because other requests. Did I create threads? No. Did I mess with 'worker pools' or similar atrocities? No. I just wrote my code and let the framework do the rest.

That's it for today. Next time, we'll look at... ah, you'll see :)

Happy hacking.