Tuesday, May 25, 2010

Dealing with High Traffic Sites - my notes distilled

This post was inspired by the recent crashes of sites like radiojamaica.com (which seems to have found its feet again), the Trinidad Guardian and the Express during the Trinidad and Tobago elections.
Disclaimer: this is based on research and "on the ground" experience. I'm sure there are persons who have significantly more experience than me.
If these strategies seem overwhelming, then grab an expert to implement them for you. Also expect to have to spend a bit.
The following are server administration strategies for dealing with high traffic sites. They have very little to do with which CMS you use; the principles are the same, and what matters is proper deployment. If you have to deal with significant traffic, here are some of the things you'll want to do:

Big Picture
1. Start monitoring immediately (assuming you aren't measuring already). Get as much data as you can on memory usage, load, etc.
2. Throw hardware at it (more memory, etc.)
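
As a starting point for the monitoring step, here is a minimal sketch in Python that takes a snapshot of load and memory usage. It assumes a Linux server with a /proc/meminfo file; the function names (parse_meminfo, memory_used_fraction, snapshot) are made up for illustration, and a real deployment would more likely rely on a dedicated monitoring tool.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: kilobytes}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_used_fraction(info):
    """Fraction of memory in use, between 0.0 and 1.0."""
    total = info["MemTotal"]
    # Older kernels have no MemAvailable field, so fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return (total - available) / total


def snapshot(meminfo_path="/proc/meminfo"):
    """Collect one load and memory sample (Linux only)."""
    with open(meminfo_path) as handle:
        info = parse_meminfo(handle.read())
    load_1min, _load_5min, _load_15min = os.getloadavg()
    return {"load_1min": load_1min, "memory_used": memory_used_fraction(info)}
```

Calling snapshot() every minute or so from a scheduled job and appending the result to a log file is enough to show when the spikes happen.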

Strategies
Here are the important strategies:
  • load balancing
  • monitoring (memory usage, spikes, etc.)
  • SQL sharding (breaking up your database) (see: http://www.slideshare.net/RockeTier/how-sharding-turned-mysql-into-the-internet-defacto-database-standard-moshe-kaplan-rocketier)
  • go asynchronous (do as the Flickr engineers do: queue whatever can be queued)
  • serve as much as you can statically: pictures, files, videos
  • caching - I recommend Varnish (I put this last, because it shouldn't be used to gloss over underlying problems. Pun intended)
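
To make the "go asynchronous" strategy above a bit more concrete, here is a minimal sketch using Python's standard queue and threading modules. The request handler only puts the slow work on a queue and returns straight away, while a background worker does the work later. The names handle_upload and resize_image are hypothetical, and a busy site would normally use a separate queue server rather than an in-process thread.

```python
import queue
import threading


def start_worker(jobs, results):
    """Run queued jobs in a background thread, one at a time."""
    def run():
        while True:
            job = jobs.get()
            if job is None:  # sentinel value: stop the worker
                jobs.task_done()
                break
            func, args = job
            try:
                results.append(func(*args))
            except Exception as exc:  # keep the worker alive if a job fails
                results.append(exc)
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def resize_image(name):
    # Hypothetical slow task; a real one would do the heavy lifting here.
    return "resized " + name


def handle_upload(jobs, name):
    """Request handler: enqueue the slow part and return immediately."""
    jobs.put((resize_image, (name,)))
    return "accepted " + name
```

The visitor gets the "accepted" response without waiting for the resize, so a burst of uploads lengthens the queue instead of tying up the web server.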

update (June 28, 2010): Check for misconfigurations. The worst mistake I made recently was pointing my cache setup at the wrong port for my proxy cache, so of course none of the sites were actually benefiting from the caching.
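
One way to catch this kind of misconfiguration is to check whether responses really pass through the cache. The sketch below assumes Varnish with its default behaviour of adding an X-Varnish header to responses (a custom VCL can remove that header, in which case this check does not apply). The function names are made up for illustration.

```python
from urllib.request import urlopen


def went_through_varnish(header_names):
    """True if the response header names include Varnish's X-Varnish marker."""
    names = {name.lower() for name in header_names}
    return "x-varnish" in names


def check_site(url, timeout=5.0):
    """Fetch a URL and report whether the response came via Varnish."""
    with urlopen(url, timeout=timeout) as response:
        return went_through_varnish(response.headers.keys())
```

Running check_site against each public site address after a configuration change would show quickly whether traffic is bypassing the cache.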
