We recently launched our overhauled i.TV app and aoltv.com. While we took a severe beating from our legacy users for removing some of their favorite features, our technology has held up better than we had hoped.
When we started re-architecting our systems about 10 months ago, we picked some of the newest technologies we could find. We did this mostly because they were fun and we wanted to try something new, but also because they offer some distinct advantages and scale easily. After playing around with Ruby and CouchDB, we finally settled on Node.js and MongoDB.
Because the technologies are so new, we had to work through a number of issues to get everything running. Here are a few things about our setup that might help you get to production. This is going to be a ramble, but I figure saying something is better than putting this post off longer.
Use What Makes You Happy
While we had to work through a lot, we more than made up for our losses by using technologies that made us feel productive. They were fun, solved many difficult problems easily, and attracted the right people to our team.
One-step Deployment
Our cluster is hosted entirely on Rackspace. It's basically EC2, just a little more expensive, but with great support and a better interface. They let you spin boxes up and down as you need them.
We wrote a simple bash script that installs all our software on a box and checks out our main repository. The script is checked in to our code base, which keeps our environment and code in sync. We use node-control to automate our deployments.
One thing to note is that planning to automate everything from the beginning didn't work. We had to start out doing things by hand and automate them as we started to repeat ourselves.
Testing
We didn't start out with any tests, but quickly began using node-async-testing for our unit tests and expresso for code coverage. We use git hooks to run the tests on every push to master, and we reject the push if a test fails or coverage goes down. Testing has dramatically reduced our bugs and regressions.
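As a rough illustration of the "reject if a test fails or coverage goes down" rule, here is a minimal sketch of the decision a hook script could make. The function name `shouldRejectPush` and the shape of the `result` object are assumptions made for this example; they are not the output format of node-async-testing or expresso.

```javascript
// Hypothetical gate for a git hook. `result` is an assumed shape:
//   { failed: <number of failing tests>, coverage: <percent covered> }
// `previousCoverage` is the coverage percentage recorded for the last accepted push.
function shouldRejectPush(result, previousCoverage) {
  if (result.failed > 0) {
    return { reject: true, reason: `${result.failed} test(s) failed` };
  }
  if (result.coverage < previousCoverage) {
    return {
      reject: true,
      reason: `coverage dropped from ${previousCoverage}% to ${result.coverage}%`,
    };
  }
  return { reject: false, reason: 'ok' };
}
```

A hook script would call this with the numbers from the test run and exit with a non-zero status when `reject` is true, which is how git hooks signal a rejection.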
App Development is a Pain
We decided early on to have our app hit a simple JSON API for its data. The idea was that it would load data on every page and take full advantage of HTTP conventions like cache headers to improve performance. This makes for an app that is extremely dependent on our API. While integrating with a third-party API like Netflix's is pretty stable, our internal API changed a lot. It was extremely difficult to figure out how to keep versions of our app in sync with our API. Users can take weeks or months to upgrade, and they can't downgrade an app if something goes wrong.
Since rollbacks are nearly impossible, we decided that we would make our API backwards compatible. We try to respect a rule that any URL should return the same format forever, so if you want to change the format, you have to introduce a new URL. If a route is going to be phased out, we simply mark it as deprecated, move it into a special deprecated section of our code base, and delete it a couple months later.
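To make the "same URL, same format forever" rule concrete, here is a small sketch of a route table where a format change gets a new URL and the old one is only flagged as deprecated. The paths (`/v1/shows`, `/v2/shows`), the payloads, and the `X-Deprecated` header are all made up for illustration; this is not our actual routing code.

```javascript
// Hypothetical route table: every path and payload here is invented for the example.
const routes = new Map([
  // Old format: a plain list of names. It stays frozen until the route is deleted.
  ['/v1/shows', { deprecated: true, handler: () => ({ shows: ['Lost'] }) }],
  // New format: objects instead of strings, so it lives at a new URL.
  ['/v2/shows', { deprecated: false, handler: () => ({ shows: [{ title: 'Lost' }] }) }],
]);

// Resolve a URL to a response. Deprecated routes still answer in their
// original format, but carry a (made-up) header so they are easy to spot.
function handle(url) {
  const route = routes.get(url);
  if (!route) {
    return { status: 404, headers: {}, body: { error: 'not found' } };
  }
  const headers = route.deprecated ? { 'X-Deprecated': 'true' } : {};
  return { status: 200, headers, body: route.handler() };
}
```

Old app versions keep calling `/v1/shows` and keep getting the shape they were built against, while new versions move to `/v2/shows`.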
Node in Production
Our Node apps run on express. We have an init.d script that runs our app using cluster, which writes the PID file, redirects logs, and daemonizes the process. The apps start on a non-privileged port (like 5000), and we proxy them with nginx. Nginx doesn't do much right now except gzip our JSON.
A Few Servers Go a Long Way (if you cache)
Not including aoltv.com, we're currently getting about 30M hits a month on our API. Our setup is shown below. Server sizes are shown in RAM (see Rackspace for more info). It's holding up just fine with low load averages.
Load Balancer (Varnish - 1 x 4GB) -----> App Servers (Nginx, Node - 1 x 1GB) -------> Database(s) (Mongo - 1 x 4GB)
We cache at every layer. Mongo keeps data in memory (see below). Our Node app sets a Cache-Control header on each response, which tells Varnish how long to cache it, and the client app itself reads the Cache-Control header to avoid hitting the servers at all. Sticking to HTTP conventions has taken us a long way.
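Here is a minimal sketch of how one Cache-Control value can drive both ends: the server builds the header, and a client reads `max-age` back out to decide whether a stored response is still usable. The helper names (`cacheControl`, `parseMaxAge`, `isFresh`) and the shape of the cache entry are assumptions for this example, and the parsing only handles the `max-age` directive.

```javascript
// Server side: build a Cache-Control value for a response.
function cacheControl(maxAgeSeconds) {
  return `public, max-age=${maxAgeSeconds}`;
}

// Client side: pull max-age (in seconds) out of a Cache-Control value.
// Returns 0 when the directive is missing, i.e. "do not reuse".
function parseMaxAge(header) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(header || '');
  return match ? Number(match[1]) : 0;
}

// `entry` is an assumed shape: { storedAt: <ms timestamp>, cacheControl: <header value> }.
// A stored response is fresh while its age is below max-age.
function isFresh(entry, now) {
  return now - entry.storedAt < parseMaxAge(entry.cacheControl) * 1000;
}
```

With this in place, a client can skip the network entirely whenever `isFresh(entry, Date.now())` is true, and anything in between that honors the same header makes the same call without any extra coordination.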
Get Your Mongo into Memory
If you make your MongoDB servers big enough, MongoDB automatically keeps every record and index in your working set in memory. We estimated our working set as all TV-Guide data for the current day. Later we realized that we could easily fit almost a week's data within 4GB of RAM. While we picked MongoDB partially because it can scale horizontally, its developers encourage you to scale vertically first. We're nowhere near the vertical limit yet and are doing just fine.
Be Ready to Write Your Own Libraries
We had to roll many of our own libraries for Node, including a MongoDB driver, an FTP client, and a host of 3rd-party API wrappers. While many are now redundant, they didn't exist when we needed them, and we couldn't just wait around for them to show up. We've published our MongoDB driver and hope to publish more soon via our GitHub account.
Node.js is Easy to Pick Up
While the concepts behind server-side development are quite different from client-side JavaScript, the client-focused JS devs we hired have picked up Node very quickly. While it has its warts (nested async calls, anyone?), it's hard to argue against JS when it comes to familiarity.
With MongoDB, Relational is Almost Inevitable
While we tried to make MongoDB non-relational, it almost never worked. TV-Guide data contains several many-to-many relationships, and you simply can't store them as nested objects. For example, a user chooses a Lineup, which lists their available Channels. Each Channel has a list of Events, which map an Episode to a Channel and start time. In theory, we could store the Events underneath the Channel, but that would mean we would have to pull out ALL the Events per Channel to get ANY of them. It ended up being easiest to store Events as separate documents.
What MongoDB really does well, though, is make every JOIN explicit. This encouraged us to denormalize data onto the most specific documents to avoid a second query. For example, by storing the name of the show on each Event, we can avoid having to hit the database a second time to get information about the show. It encourages you to make each document you DO have usable without a second trip to the db.
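A rough sketch of what this looks like in practice: Events live in their own collection, point at a Channel and an Episode by id, and carry a copy of the show name so a listing needs only one query. All field names and sample values below are assumptions for illustration, and the in-memory filter stands in for a real MongoDB query.

```javascript
// Hypothetical documents; field names and values are invented for the example.
const episode = { _id: 'ep1', showName: 'Lost', title: 'Pilot' };
const channel = { _id: 'ch7', name: 'ABC' };

// An Event maps an Episode to a Channel and a start time. The show name is
// copied onto the Event (denormalized) so listings don't need a second lookup.
function buildEvent(episode, channel, startTime) {
  return {
    channelId: channel._id,
    episodeId: episode._id,
    startTime,
    showName: episode.showName, // denormalized copy
  };
}

// In-memory stand-in for a query such as:
//   db.events.find({ channelId: id, startTime: { $gte: from, $lt: to } })
// Because Events are separate documents, we fetch only the ones we need.
function eventsForChannel(events, channelId, from, to) {
  return events.filter(
    (e) => e.channelId === channelId && e.startTime >= from && e.startTime < to
  );
}
```

The trade-off is that the copied `showName` has to be rewritten if the show is ever renamed, which for guide data is rare compared to how often listings are read.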
What Else?
I'd be happy to write more about any of these topics or answer any questions.