Major sites not conserving bandwidth with gzip content compression

At GreatSchools we do around 1M real page views per day, plus another 250k or so from crawlers. Before content compression we were running well in excess of 10Mbit/s during peak hours and were getting hit with bursting charges in high-traffic months. When we switched our proxy servers to Apache with mod_deflate (gzip-based compression) we saw a 35% decrease in bandwidth utilization, and the 3 proxy servers that do the compression and sit in front of our 10 web servers barely register any load at all.
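For reference, a minimal mod_deflate setup along these lines might look like the sketch below. This is not our exact production config; the module path and the MIME type list are assumptions you'd adjust for your own servers, and the `Header` directive requires mod_headers:

```apache
# Load mod_deflate (path may differ on your distribution).
LoadModule deflate_module modules/mod_deflate.so

# Compress text-based responses only; images are already compressed.
AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/x-javascript

# Standard workarounds from the mod_deflate docs for old browser quirks:
# Netscape 4.x handled compression of non-HTML poorly, and 4.06-4.08
# couldn't handle it at all; MSIE masquerades as Mozilla/4 but is fine.
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# Make sure caching proxies don't serve compressed pages to clients
# that didn't ask for them.
Header append Vary User-Agent
```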

Our average page size (on the first page view) clocks in at around 80-90K. Compare that with your average Web 2.0 company like Digg or 37signals using Prototype, Scriptaculous, Lightbox, and so on, and you'll often see a total page size closer to 200K; remarkably, these companies are not doing gzip compression! In fact, many major sites have not yet realized the benefits of content compression, though some, such as CNN, MySpace, and Slashdot, have already caught on!
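Part of why the savings are so large is that HTML and JavaScript are highly repetitive and compress extremely well. A quick local sketch (assuming `gzip` and standard Unix tools are available; the markup is made up to roughly resemble a real page) shows the effect:

```shell
# Generate ~70K of repetitive HTML, roughly the shape of a real listing
# page, then compare the raw size to the gzip-compressed size.
html=$(for i in $(seq 1 1000); do
  echo '<div class="result"><a href="/school/123">Lincoln Elementary</a></div>'
done)

raw=$(printf '%s' "$html" | wc -c)
gz=$(printf '%s' "$html" | gzip -c | wc -c)

echo "raw: $raw bytes, gzipped: $gz bytes"
```

Real pages are less uniform than this, so expect ratios closer to the 35% overall savings we saw than to the extreme ratio a purely repetitive page produces.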

If you’re running a site that does any real traffic, you owe it to yourself to look into content compression of HTML, CSS, and JavaScript with mod_deflate or an equivalent; the bandwidth savings can be tremendous! You can click here to check if your site is already using gzip-based content compression.
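If you'd rather check from the command line, a small sketch like this works too (assuming `curl` and `grep` are available; the hostname in the usage comment is just a placeholder):

```shell
# check_gzip: reads HTTP response headers on stdin and reports whether
# the response was gzip-compressed.
check_gzip() {
  if grep -qi '^content-encoding:.*gzip'; then
    echo "gzip enabled"
  else
    echo "no compression"
  fi
}

# Typical usage against a live site: fetch the page, dump only the
# headers, and pipe them through the checker.
#   curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip,deflate' http://www.example.com/ | check_gzip
```

Note that the check sends a full GET rather than a HEAD request, since some servers omit the Content-Encoding header on HEAD responses.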


3 Responses to Major sites not conserving bandwidth with gzip content compression

  1. Aleksey Gopachenko says:

    The problem is that IE (dominating browser) handles gzip *really* badly (no caching, errors).

  2. Todd Huss says:

Hi Aleksey, while there have been some bugs with Internet Explorer and compression in the past, we haven’t seen any issues with it.

    We have several million unique users a month that complain when things break and so far not a peep. Also, with massive sites like CNN, Slashdot, and MySpace doing compression, I’m hard pressed to accept the argument that compression introduces major problems for IE users.

  3. Aleksey Gopachenko says:

Your site may not have problems noticeable to users. But you are comparing with a “Web 2.0” site – and I have quite a bit of practical experience in this area – gzipped JS and gzip + XHR are a bad idea in IE. Also, IE will not cache gzipped content under some (not so rare) conditions, effectively disabling any traffic “savings”. You could probably design a Web 2.0 site carefully with gzip compression in mind… but I don’t really see why.
While my first page is ~200k, with 80% of that being library code (dojo), further updates range from 100 to 2000 bytes. The system *feels* very responsive even over very slow connections, and the session traffic from a typical user is about 10 to 80 times less.
